Enterprise Control Language for the HPCC Systems big data platform. Declarative language for data-intensive computing.
ECL (Enterprise Control Language) is a declarative, data-centric programming language designed for the HPCC Systems (High Performance Computing Cluster) big data platform, originally developed by LexisNexis Risk Solutions in the early 2000s. ECL was created to enable non-traditional programmers, particularly data analysts and researchers, to express complex data transformations, linkage operations, and analytics on massive datasets without managing low-level distributed computing details. The language combines declarative data flow definitions with a functional programming style, where data transformations are expressed as operations on datasets — SORT, DEDUP, JOIN, ROLLUP, PROJECT, and DISTRIBUTE — that the ECL compiler optimizes and parallelizes across cluster nodes.
HPCC Systems provides two processing engines: Thor for batch ETL processing (similar to Hadoop MapReduce) and Roxie for real-time query serving, both programmable through ECL. The platform has been used to process petabytes of data in industries including healthcare analytics, financial risk assessment, government data matching, and identity resolution. ECL supports user-defined functions, macros, modules for code organization, and SOAP/HTTP service definitions for exposing analytics as web services.
The language's record-oriented type system defines data schemas that flow through transformation pipelines, with the compiler performing whole-program optimization to generate efficient C++ code executed across the cluster. While less widely known than SQL or Python-based big data tools, ECL and HPCC Systems power some of the largest data processing operations in the world, particularly in the legal, risk, and compliance sectors where LexisNexis operates.
ECL workflow changes affect large-scale data processing pipelines where a modified JOIN condition or TRANSFORM function can produce incorrect results across billions of records. ETL logic errors in ECL may not surface until downstream reports reveal data quality issues.
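To make the risk concrete, here is a minimal sketch in plain Python (not ECL) of how loosening a join condition changes which records match. The `people` and `claims` rows, the field names, and the `inner_join` helper are all hypothetical and exist only for illustration; a real ECL JOIN runs distributed across the cluster and is not implemented this way.

```python
# Hypothetical sample rows; not taken from any real dataset.
people = [
    {"id": 1, "state": "FL"},
    {"id": 2, "state": "GA"},
]
claims = [
    {"person_id": 1, "state": "FL", "amount": 100},
    {"person_id": 1, "state": "NY", "amount": 250},
    {"person_id": 2, "state": "GA", "amount": 75},
]


def inner_join(left, right, condition):
    """Nested-loop inner join: keep every (left, right) pair that satisfies condition."""
    return [(l, r) for l in left for r in right if condition(l, r)]


# Original rule: match on id AND state.
strict = inner_join(
    people, claims,
    lambda p, c: p["id"] == c["person_id"] and p["state"] == c["state"],
)

# Modified rule: the state check was dropped, so an extra claim now matches.
loose = inner_join(people, claims, lambda p, c: p["id"] == c["person_id"])

strict_total = sum(c["amount"] for _, c in strict)
loose_total = sum(c["amount"] for _, c in loose)
print(len(strict), strict_total)
print(len(loose), loose_total)
```

On three sample rows the extra match is easy to spot. The same one-clause change applied to billions of records would shift totals without raising any error, which is why such edits deserve a close look in review.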
Comparing ECL files catches changes to dataset schemas, modified deduplication rules, and altered sort orders that affect data linkage accuracy. Data engineering teams must review ECL diffs carefully before deploying to production clusters processing sensitive financial or healthcare data.
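As a rough illustration of the kind of change such a comparison surfaces, the sketch below uses Python's standard `difflib` module to diff two versions of a small ECL snippet. The ECL text, the logical file name `~demo::people`, the file labels, and the `changed_lines` helper are hypothetical examples, and this is not how UtraDiff itself is implemented.

```python
import difflib

# Hypothetical "before" version of an ECL file.
OLD_ECL = """\
PersonRec := RECORD
  UNSIGNED4 id;
  STRING25 lastname;
  STRING15 firstname;
END;
people := DATASET('~demo::people', PersonRec, THOR);
uniq := DEDUP(SORT(people, lastname, firstname), lastname, firstname);
OUTPUT(uniq);
"""

# Hypothetical "after" version: the DEDUP rule now ignores firstname.
NEW_ECL = OLD_ECL.replace(
    "DEDUP(SORT(people, lastname, firstname), lastname, firstname)",
    "DEDUP(SORT(people, lastname, firstname), lastname)",
)


def changed_lines(old, new):
    """Return only the added/removed lines of a unified diff, without headers."""
    diff = difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="people_v1.ecl", tofile="people_v2.ecl",
        lineterm="", n=0,
    )
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


for line in changed_lines(OLD_ECL, NEW_ECL):
    print(line)
```

The only difference is one dropped field in the DEDUP condition, yet it changes which records count as duplicates: people who share a last name would now collapse into a single row.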
UtraDiff compares ECL files with syntax highlighting that distinguishes RECORD definitions, TRANSFORM functions, OUTPUT actions, and dataset declarations. The side-by-side view shows how data pipeline transformations change between versions, while the inline view consolidates modifications to complex JOIN and SORT expressions.
Keywords, field references, and string literals are color-coded so structural logic changes stand out from parameter tweaks. Alt+arrow navigation moves between changed definitions quickly.
Supported extensions: .ecl