DataFusionSharp

.NET bindings for Apache DataFusion, a fast, extensible query engine built on Apache Arrow for high-performance analytical query processing.

Note: This is an independent community project and is not officially associated with or endorsed by the Apache Software Foundation or the Apache DataFusion project.

Features

  • SQL queries — Execute SQL against CSV, Parquet, and JSON files
  • Apache Arrow — Zero-copy data exchange via the Arrow columnar format
  • Object stores — Query data on S3, Azure Blob Storage, Google Cloud Storage, or the local filesystem
  • Hive partitioning — Read and write Hive-style partitioned datasets
  • Parameterized queries — Bind named parameters with type-safe ScalarValue types
  • Cross-platform — Linux (x64/arm64), Windows (x64), macOS (arm64)

Quick Install

dotnet add package DataFusionSharp

Minimal Example

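using DataFusionSharp;

// Create the runtime and a session context; both are disposed when they go out of scope.
using var runtime = DataFusionRuntime.Create();
using var context = runtime.CreateSessionContext();

// Register orders.csv as a table named "orders".
await context.RegisterCsvAsync("orders", "orders.csv");

// Run an aggregate query over the registered table and show the result.
using var df = await context.SqlAsync("SELECT customer_id, sum(amount) AS total FROM orders GROUP BY customer_id");
await df.ShowAsync();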
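
The example above reads a file named orders.csv with customer_id and amount columns. For a first run, a small input file can be created as sketched below. The column names come from the query; the rows are made-up sample values, not data shipped with the project, and the file is assumed to live in the program's working directory.

```shell
# Write a tiny sample orders.csv into the current directory.
# Header matches the columns used by the query; rows are illustrative only.
cat > orders.csv <<'EOF'
customer_id,amount
1,10.50
1,4.50
2,20.00
3,7.25
EOF
```

With this file in place, the query should return one row per customer_id, each with the sum of that customer's amounts.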

Next Steps