侧边栏壁纸
  • 累计撰写 6 篇文章
  • 累计创建 15 个标签
  • 累计收到 1 条评论

目 录CONTENT

文章目录

SplitQuery? 笛卡尔爆炸?

komi
2023-11-24 / 0 评论 / 0 点赞 / 362 阅读 / 1,403 字

起因

在最近经手的项目中发现了一个奇怪的配置

case "mssql":
    options.UseSqlServer(config.MsSqlConnection, o => o.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery));
    break;

这里的SplitQuery指的是什么? 进源码中看下发现是一个枚举类

/// <summary>
///     Indicates how the related collections in a query should be loaded from database.
/// </summary>
/// <remarks>
///     See <see href="https://aka.ms/efcore-docs-split-queries">EF Core split queries</see> for more information and examples.
/// </remarks>
public enum QuerySplittingBehavior
{
    /// <summary>
    ///     The related collections will be loaded in same database query as parent query.
    /// </summary>
    /// <remarks>
    ///     This behavior generally guarantees result consistency in the face of concurrent updates
    ///     (but details may vary based on the database and transaction isolation level in use).
    ///     However, this can cause performance issues when the query loads multiple related collections.
    /// </remarks>
    SingleQuery = 0,

    /// <summary>
    ///     The related collections will be loaded in separate database queries from the parent query.
    /// </summary>
    /// <remarks>
    ///     This behavior can significantly improve performance when the query loads multiple collections.
    ///     However, since separate queries are used, this can result in inconsistent results when concurrent updates occur.
    ///     Serializable or snapshot transactions can be used to mitigate this
    ///     and achieve consistency with split queries, but that may bring other performance costs and behavioral difference.
    /// </remarks>
    SplitQuery
}

正好项目中用到的了EF Core 那说明这个配置是和EF Core的某个特性是相关的 经过一番查找 从MSDN中找到了相关的资料

When working against relational databases, EF loads related entities by introducing JOINs into a single query. While JOINs are quite standard when using SQL, they can create significant performance issues if used improperly.

从MSDN中的原话里可以看出 EF在处理联表查询的时候默认使用一整句的JOIN 但是这样有可能会造成性能上的问题

笛卡尔爆炸

在讲笛卡尔爆炸这个概念之前 先引入笛卡尔乘积这样一个概念

笛卡儿积在通俗地来讲指的是从两个集合(Set)中的元素组成新的配对集合 以麦当劳套餐来比喻
门店将汉堡线和饮品线上的每个产品集合组成一个新的套餐会有多少种套餐

而笛卡尔积在数据库中的表现形式正是联表查(join)操作 两个表在数据量不是很大的情况下查询来讲可能对性能影响模棱两可 但是对于一些因业务需求日益增加列的大宽表以及数据存量过大的表来讲就会产生查询过慢以及数据冗余的问题

场景模拟

开发环境

  • Sql Server 2022 Dev 16.0.1105.1
  • .Net 7
  • Microsoft.EntityFrameworkCore.SqlServer 7.0.14
  • Microsoft.EntityFrameworkCore.SqlServer 7.0.14

表脚本

产品表

CREATE TABLE [dbo].[Prod](
	[productName] [varchar](20) NULL,
	[price] [decimal](18, 0) NULL,
	[description] [text] NULL,
	[product_id] [bigint] IDENTITY(0,1) NOT NULL,
 CONSTRAINT [prim_key] PRIMARY KEY CLUSTERED 
(
	[product_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

订单表

CREATE TABLE [dbo].[Ord](
	[product_id] [bigint] NULL,
	[quantity] [int] NULL,
	[ord_id] [bigint] IDENTITY(0,1) NOT NULL,
	[customer_id] [bigint] NULL,
 CONSTRAINT [ord_prim_key] PRIMARY KEY CLUSTERED 
(
	[ord_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO

客户表

CREATE TABLE [dbo].[Customer](
	[customer_id] [bigint] IDENTITY(0,1) NOT NULL,
	[name] [varchar](50) NULL,
 CONSTRAINT [PK_Customer] PRIMARY KEY CLUSTERED 
(
	[customer_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO

程序

DbContext

    public class DemDbContext : DbContext
    {
        public DemDbContext(DbContextOptions<DemDbContext> options)
            : base(options)
        {
        }

        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {
            modelBuilder.Entity<Customer>().HasMany(e => e.Orders).WithOne(e => e.Customer).HasForeignKey(x => x.customer_id);
            modelBuilder.Entity<Prod>().HasMany(e => e.Orders).WithOne(e => e.Product).HasForeignKey(x => x.product_id);
        }

        public DbSet<Ord> Ords { get; set; }

        public DbSet<Prod> Prods { get; set; }

        public DbSet<Customer> Customers { get; set; }
    }

主程序

using dem;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;

public class Program
{
    /// <summary>
    /// 应用程序配置对象
    /// </summary>
    public static IConfigurationRoot Configuration { get; private set; }

    /// <summary>
    /// 数据库配置对象
    /// </summary>
    public static DbContextOptions<DemDbContext> DbContextOptions { get; private set; }

    /// <summary>
    /// 获取实体上下文
    /// </summary>
    public static DemDbContext DbContext => new(DbContextOptions);

    private static void Main(string[] args)
    {
        // 初始化
        Configuration = InitConfiguration();
        DbContextOptions = InitDatabase(Configuration);

        var ords = DbContext.Prods.Include(p => p.Orders).ToList();
    }

    /// <summary>
    /// 初始化数据库配置
    /// </summary>
    /// <param name="configuration">应用程序配置</param>
    /// <returns></returns>
    public static DbContextOptions<DemDbContext> InitDatabase(IConfigurationRoot configuration)
    {
        var connectionString = configuration.GetConnectionString("LocalMSSql");
        var optionsBuilder = new DbContextOptionsBuilder<DemDbContext>();

        var logging = LoggerFactory.Create(b =>
        {
            b.AddConsole();
        });

        optionsBuilder
            .UseSqlServer(connectionString, options =>
        {
            options.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery);
        })
            .UseLoggerFactory(logging);


        var dbContextOptions = optionsBuilder.Options;

        return dbContextOptions;
    }

    /// <summary>
    /// 初始化应用程序配置文件
    /// </summary>
    /// <returns></returns>
    public static IConfigurationRoot InitConfiguration()
    {
        var basePath = @"G:\work\dem\dem";

        var configBuilder = new ConfigurationBuilder()
            .SetBasePath(basePath)
            .AddJsonFile("appsettings.json", optional: false, reloadOnChange: true)
            .AddEnvironmentVariables();

        var configuration = configBuilder.Build();

        Console.WriteLine("配置文件:");
        Console.WriteLine(Path.Combine(basePath, "appsettings.json"));

        return configuration;
    }
}

结果生成

  1. 使用SingleQuery
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (15ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT [p].[product_id], [p].[description], [p].[price], [p].[productName], [o].[ord_id], [o].[customer_id], [o].[product_id], [o].[quantity]
      FROM [Prod] AS [p]
      LEFT JOIN [Ord] AS [o] ON [p].[product_id] = [o].[product_id]
      ORDER BY [p].[product_id]
  1. 使用SplitQuery
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (15ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT [p].[product_id], [p].[description], [p].[price], [p].[productName]
      FROM [Prod] AS [p]
      ORDER BY [p].[product_id]
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (1ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT [o].[ord_id], [o].[customer_id], [o].[product_id], [o].[quantity], [p].[product_id]
      FROM [Prod] AS [p]
      INNER JOIN [Ord] AS [o] ON [p].[product_id] = [o].[product_id]
      ORDER BY [p].[product_id]

可以看到这里EF Core在最后的生成策略上是不同的 使用SplitQuery避免了使用SingleQuery的left join而产生的冗余数据

参考

https://learn.microsoft.com/en-us/ef/core/querying/single-split-queries

https://learn.microsoft.com/en-us/sql/relational-databases/performance/joins?view=sql-server-ver16

0

评论区