DataHub Releases
Summary
Version | Release Date | Links |
---|---|---|
v0.10.1 | 2023-03-23 | Release Notes, View on GitHub |
v0.10.0 | 2023-02-07 | Release Notes, View on GitHub |
v0.9.6.1 | 2023-01-31 | Release Notes, View on GitHub |
v0.9.6 | 2023-01-13 | Release Notes, View on GitHub |
v0.9.5 | 2022-12-23 | View on GitHub |
v0.9.4 | 2022-12-20 | View on GitHub |
v0.9.3 | 2022-11-30 | View on GitHub |
v0.9.2 | 2022-11-04 | View on GitHub |
v0.9.1 | 2022-10-31 | View on GitHub |
v0.9.0 | 2022-10-11 | View on GitHub |
v0.8.45 | 2022-09-23 | View on GitHub |
v0.8.44 | 2022-09-01 | View on GitHub |
v0.8.43 | 2022-08-09 | View on GitHub |
v0.8.42 | 2022-08-03 | View on GitHub |
v0.8.41 | 2022-07-15 | View on GitHub |
v0.8.40 | 2022-06-30 | View on GitHub |
v0.8.39 | 2022-06-24 | View on GitHub |
v0.8.38 | 2022-06-09 | View on GitHub |
v0.8.37 | 2022-06-09 | View on GitHub |
v0.8.36 | 2022-06-02 | View on GitHub |
v0.8.35 | 2022-05-18 | View on GitHub |
v0.8.34 | 2022-05-04 | View on GitHub |
v0.8.33 | 2022-04-15 | View on GitHub |
v0.8.32 | 2022-04-04 | View on GitHub |
v0.8.31 | 2022-03-17 | View on GitHub |
v0.8.30 | 2022-03-17 | View on GitHub |
v0.8.29 | 2022-03-10 | View on GitHub |
v0.8.28 | 2022-03-07 | View on GitHub |
DataHub v0.10.1
Released on 2023-03-23 by @aditya-radhakrishnan.
Release Highlights
User Experience
- The Queries Tab has a new look and now supports manually adding and annotating queries directly from the UI, making it easier to share trusted SQL logic with others
- Glossary Terms now show "Contained by" and "Inherited by" relationships
- Resolved issues with Download to CSV for large volumes of entities
- Updated the Analytics tab: view Monthly Active Users to keep track of DataHub adoption and activity within your organization
- Ongoing UI optimizations focused on improving the navigation experience
Metadata Ingestion
BigQuery
- Improvements to memory usage during metadata extraction
- Ingestion now captures Dataset Labels
- Emit cross-project usage
PowerBI
- Support for Platform Instance to uniquely identify multiple instances of the same platform
- Support for PowerBI <> (Redshift, BigQuery) lineage extraction
- Extract entity descriptions
Miscellaneous
- DataHub Integrations Catalog to quickly filter and search for supported integrations
- Kafka Connect - support for stateful ingestion & lowercasing URNs
- Snowflake: improvements to memory usage during metadata extraction
- Postgres: supports estimated row counts during profiling
- Fix to dbt ingestion to address inconsistent upper/lower casing
- S3 ingestion now supports path_specs of multiple buckets in the same recipe
- Looker: Upgrade Looker API from 3.1 to 4.0
- Great Expectations: support for lowercasing URNs
- Tableau: Support for Project Path & Containers; ingestion more resilient to timeout exceptions
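As a rough illustration of the S3 item above, the sketch below writes a recipe as a Python dictionary that mirrors the usual YAML layout and lists the buckets it references. The bucket names, paths, and server URL are hypothetical placeholders, the `include` key inside each `path_specs` entry is an assumption based on the S3 source's documented path spec format, and other settings such as AWS credentials are omitted.

```python
from urllib.parse import urlparse

# Hypothetical recipe: bucket names, paths, and the server URL are placeholders.
# The "include" key is assumed from the S3 source's path spec format.
recipe = {
    "source": {
        "type": "s3",
        "config": {
            "path_specs": [
                {"include": "s3://example-bucket-a/events/*.parquet"},
                {"include": "s3://example-bucket-b/logs/*.csv"},
            ],
        },
    },
    "sink": {"type": "datahub-rest", "config": {"server": "http://localhost:8080"}},
}


def buckets_in_recipe(recipe: dict) -> list:
    """List the distinct buckets referenced by the recipe's path_specs, in order."""
    seen = []
    for spec in recipe["source"]["config"]["path_specs"]:
        # For an s3:// URI, the bucket name is the network-location part.
        bucket = urlparse(spec["include"]).netloc
        if bucket not in seen:
            seen.append(bucket)
    return seen
```

Calling `buckets_in_recipe(recipe)` is a quick way to confirm that a single recipe really spans more than one bucket before running it.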
Developer Experience
Miscellaneous
- Neo4j support for lineage time filter
- Metadata model support for JSON schemas stored in Files, Directories, and Kafka Schema Registry
- Timeline API now supports Glossary Terms
- Improvements to startup time for DataHub CLI
API Docs & Guides
- Table of contents to understand DataHub APIs at a glance
- Guides:
  - Add Tags, Terms, Owners to entities
  - Create datasets
  - Manage Lineage
Search Improvements
- searchAcrossEntities/Lineage improvements
- support searchAfter
- advanced query, identity autocomplete, exact match weight
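For readers unfamiliar with the idea behind `searchAfter`, the sketch below shows cursor-based pagination in the style of Elasticsearch's `search_after`: each request passes the sort key of the last hit from the previous page instead of a numeric offset. It runs over an in-memory list and is not DataHub's API. The document shape and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical document shape: (sort value, unique id used as a tiebreaker).
Doc = Tuple[int, str]


def search_after_page(
    docs: List[Doc], size: int, after: Optional[Doc] = None
) -> Tuple[List[Doc], Optional[Doc]]:
    """Return one page of docs sorted by (sort value, id) plus the next cursor."""
    if size < 1:
        raise ValueError("size must be at least 1")
    ordered = sorted(docs)
    if after is not None:
        # Keep only documents that sort strictly after the cursor.
        ordered = [d for d in ordered if d > after]
    page = ordered[:size]
    # The cursor is the sort key of the last hit; None signals the end.
    cursor = page[-1] if len(page) == size else None
    return page, cursor


def iterate_all(docs: List[Doc], size: int) -> List[Doc]:
    """Walk every page by feeding each cursor back into the next request."""
    results: List[Doc] = []
    cursor: Optional[Doc] = None
    while True:
        page, cursor = search_after_page(docs, size, cursor)
        results.extend(page)
        if cursor is None:
            return results
```

Because the cursor is a sort key and not a position, the cost of fetching a deep page does not grow with the number of results already skipped, which is the usual reason to prefer this style over offset-based paging.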
Breaking Changes
What's Changed
- fix(test): cleanup test on setup error by @david-leifker in https://github.com/datahub-project/datahub/pull/7259
- feat(cli): add 0.10 awareness to upgrade prompt by @shirshanka in https://github.com/datahub-project/datahub/pull/7273
- chore(ci): cleanup build to remove dependencies duckdb, dev by @anshbansal in https://github.com/datahub-project/datahub/pull/7267
- feat(oidc): add options for preferred jws algorithm by @david-leifker in https://github.com/datahub-project/datahub/pull/7245
- chore(cypress): upgrade cypress to latest v12.5.1 by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7276
- fix(ingest/bigquery) - Fix for Bigquery parser quoted semicolon in the FROM table name as well by @treff7es in https://github.com/datahub-project/datahub/pull/7277
- chore(ci): ensure kafka setup runs for smoke tests by @anshbansal in https://github.com/datahub-project/datahub/pull/7278
- feat(ingest/bigquery) - Reporting current state of BigQuery ingestion by @treff7es in https://github.com/datahub-project/datahub/pull/7282
- feat(graphql): enabling graphql for data platform instance aspects by @sgomezvillamor in https://github.com/datahub-project/datahub/pull/7177
- feat(api): Timeline API supports Glossary Terms now by @vojtechneradatos in https://github.com/datahub-project/datahub/pull/7229
- getting rid of build locally(broken) for ./gradlew quickstart(working) by @laulpogan in https://github.com/datahub-project/datahub/pull/7283
- chore(ci): remove redundant quickstart check by @anshbansal in https://github.com/datahub-project/datahub/pull/7286
- Update smoke.sh by @david-leifker in https://github.com/datahub-project/datahub/pull/7284
- docs(release notes): Managed DataHub v0.2.0 release notes by @david-leifker in https://github.com/datahub-project/datahub/pull/7299
- docs(release): updating docs per release process by @david-leifker in https://github.com/datahub-project/datahub/pull/7281
- doc(access): move heading above the images by @anshbansal in https://github.com/datahub-project/datahub/pull/7291
- fix(docs): kafka - update docs to indicate protobuf support by @shirshanka in https://github.com/datahub-project/datahub/pull/7280
- fix(system-update): fixes system-update with more than 1 partition by @david-leifker in https://github.com/datahub-project/datahub/pull/7302
- fix(ui): fix styling on sign up and reset screens by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7301
- fix(cypress): fix broken cypress tag tests by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7306
- chore(ci): speed up ingestion test runs by @anshbansal in https://github.com/datahub-project/datahub/pull/7296
- docs(release notes): Update updating-datahub.md by @david-leifker in https://github.com/datahub-project/datahub/pull/7311
- fix(ingest/bigquery): Usage rate limiting and lineage exported log fix by @treff7es in https://github.com/datahub-project/datahub/pull/7297
- fix(bootstrap): do not re-run retention policy ingestion by @anshbansal in https://github.com/datahub-project/datahub/pull/7295
- refactor(github): change github reference to git references by @anshbansal in https://github.com/datahub-project/datahub/pull/7308
- fix(datahub-upgrade): allow registry override by @david-leifker in https://github.com/datahub-project/datahub/pull/7258
- feat(cli): improve startup time by @hsheth2 in https://github.com/datahub-project/datahub/pull/7292
- fix(search): correctly filter fields in EDITABLE_FIELD_TO_QUERY_PAIRS with a list of values by @jinlintt in https://github.com/datahub-project/datahub/pull/7303
- fix(ingest/bigquery) Lowering significantly the memory usage of the BigQuery connector by @treff7es in https://github.com/datahub-project/datahub/pull/7315
- chore(ingest): upgrade to mypy 1.0.0 by @hsheth2 in https://github.com/datahub-project/datahub/pull/7313
- fix(tests): Remove rollback-reports, add to ignore by @david-leifker in https://github.com/datahub-project/datahub/pull/7312
- perf(ingest): speed up MCPW.validate() by @hsheth2 in https://github.com/datahub-project/datahub/pull/7319
- fix(ingest/bigquery): Fix for table cache was not cleared by @treff7es in https://github.com/datahub-project/datahub/pull/7323
- fix(ingest/bigquery): Improve memory usage of lineage extraction by @treff7es in https://github.com/datahub-project/datahub/pull/7326
- docs(): Adding notebook support disclaimer by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7327
- fix(docs): sort sources by display name in doc's sidebar by @Masterchen09 in https://github.com/datahub-project/datahub/pull/7322
- fix(transformers): pattern add domain transformer - enable replace_existing by @asikowitz in https://github.com/datahub-project/datahub/pull/7317
- fix(ci): remove command from cache key as irrelevant for dependency by @anshbansal in https://github.com/datahub-project/datahub/pull/7314
- fix(check upgrade): update logic to compare server and client version by @mayurinehate in https://github.com/datahub-project/datahub/pull/7238
- fix(tracking): Remove 'title' field from tracking by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7328
- fix(homepage): make entity counts execute in parallel and make cache configurable by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7249
- docs(delete): cleanup removed option by @anshbansal in https://github.com/datahub-project/datahub/pull/7335
- feat(ingestion): powerbi # Configurable Admin API by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7055
- fix(sso) Retrieve cookie configs separately from SSO configs by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7330
- logging(cli): dropping neo4j message to debug to avoid confusion by @shirshanka in https://github.com/datahub-project/datahub/pull/7340
- perf(matadata-io): neo4j generateLineageStatement use shortestPath by @shidianshifen in https://github.com/datahub-project/datahub/pull/7219
- fix(tableau): make Tableau ingestor resilient to timeout exceptions by @skrydal in https://github.com/datahub-project/datahub/pull/7333
- chore(ci): mark tests correctly by @anshbansal in https://github.com/datahub-project/datahub/pull/7337
- refactor(upgrade): Trim upgrade name before executing by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7343
- fix(ui) Update styles of embedded profile page to match designs by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7348
- fixed links and improved recommendations by @laulpogan in https://github.com/datahub-project/datahub/pull/7334
- gradle(development): add additional commands for development by @david-leifker in https://github.com/datahub-project/datahub/pull/7321
- fix(search): support searchFlags for GraphQL by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7346
- fix(elasticsearch): make alias creation atomic by @david-leifker in https://github.com/datahub-project/datahub/pull/7332
- Saas docs migration by @laulpogan in https://github.com/datahub-project/datahub/pull/6603
- removing local airflow from sidebar and adding a warning at the top by @laulpogan in https://github.com/datahub-project/datahub/pull/7331
- development(docker): add flag to gradle for quickstart by @david-leifker in https://github.com/datahub-project/datahub/pull/7355
- fix(gradle): fix gradle command referenced in docs by @david-leifker in https://github.com/datahub-project/datahub/pull/7318
- fix(ingest/bigquery): Increase batch size in metadata extraction if no partitioned table involved by @treff7es in https://github.com/datahub-project/datahub/pull/7252
- feat(cli): make deprecations, renames easier to notice by @anshbansal in https://github.com/datahub-project/datahub/pull/7310
- fix(cli): Corrects search filter for delete by @pedro93 in https://github.com/datahub-project/datahub/pull/7367
- fix(ingestion/snowflake): Fixing stateful ingestion commit at Snowflake source by @treff7es in https://github.com/datahub-project/datahub/pull/7363
- fix(ingestion): powerbi # continue ingestion if m-query parsing fail by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7360
- feat: add chart entities to similar browsepath as dashboards by @looppi in https://github.com/datahub-project/datahub/pull/7293
- fix(lineage): Include maxHops in Lineage Cache Key + misc UI improvements by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7351
- refactor(ingest,athena): update athena sample recipe by @bossenti in https://github.com/datahub-project/datahub/pull/7368
- fix(ingest/looker): do not instantiate LookerDashboardSource on test_connection by @asikowitz in https://github.com/datahub-project/datahub/pull/7369
- fix(deps): pin snowflake-connector-python by @asikowitz in https://github.com/datahub-project/datahub/pull/7365
- feat(ingest): json-schema - add json schema support for files and kaf… by @shirshanka in https://github.com/datahub-project/datahub/pull/7361
- docs: fix broken link by @sandertan in https://github.com/datahub-project/datahub/pull/7344
- chore(versions): bump versions by @david-leifker in https://github.com/datahub-project/datahub/pull/7358
- test(cli): add check for missing init files by @anshbansal in https://github.com/datahub-project/datahub/pull/7378
- fix(ingest/snowflake): Improve memory usage of metadata extraction by @asikowitz in https://github.com/datahub-project/datahub/pull/7349
- feat(elasticsearch): advanced query, identity autocomplete, exact match weight by @david-leifker in https://github.com/datahub-project/datahub/pull/7354
- feat(queries): Overhaul Queries Tab by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7366
- chore(version): additional version bumps & suppressions by @david-leifker in https://github.com/datahub-project/datahub/pull/7382
- fix(lineage): Fix Upstream + Downstream Count in presence of Soft-Deleted / Non-Existent references by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7374
- fix(dep/json-schema): Fixing json-schema dependencies by @treff7es in https://github.com/datahub-project/datahub/pull/7383
- feat(analytics): add monthly active users in highlights by @anshbansal in https://github.com/datahub-project/datahub/pull/7341
- fix(search): fix search filters, handle detection of keyword subfield by @david-leifker in https://github.com/datahub-project/datahub/pull/7372
- chore(bump): bump hadoop client and fix exclusion name by @david-leifker in https://github.com/datahub-project/datahub/pull/7386
- build(scan): enable trivy scan ingestion-base by @david-leifker in https://github.com/datahub-project/datahub/pull/7389
- chore(ci): relax bigquery dependency by @anshbansal in https://github.com/datahub-project/datahub/pull/7309
- build(idea): mark metadata-ingestion sources and tests by @asikowitz in https://github.com/datahub-project/datahub/pull/7394
- docs(website): Add airtel logo by @jeffmerrick in https://github.com/datahub-project/datahub/pull/7395
- fix(ingest/oracle) add database name to oracle urn name by @jaegwonseo in https://github.com/datahub-project/datahub/pull/7016
- fix(docs) Update transformers docs to note not minting urns by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7399
- chore(ci): update base dependencies by @anshbansal in https://github.com/datahub-project/datahub/pull/7390
- fix(search): exact match updates per review by @david-leifker in https://github.com/datahub-project/datahub/pull/7385
- fix(ingest): Do not require platform_instance for stateful ingestion by @asikowitz in https://github.com/datahub-project/datahub/pull/7397
- Dockerize updates by @david-leifker in https://github.com/datahub-project/datahub/pull/7387
- fix(ingest/bigquery): Correctly upsert lineage_map when parsing view ddl by @asikowitz in https://github.com/datahub-project/datahub/pull/7403
- feat(timeBasedLineage): add feature flag for always producing MCL by @pedro93 in https://github.com/datahub-project/datahub/pull/7407
- fix(ingest/bigquery): Prefer parsed lineage for view over lineage from audit logs by @mayurinehate in https://github.com/datahub-project/datahub/pull/7408
- Update README.md by @amartinson193 in https://github.com/datahub-project/datahub/pull/7400
- docs(logo) add VanMoof logo to site by @maggiehays in https://github.com/datahub-project/datahub/pull/7402
- feat(ingest/kafka-connect): add config to lowercase urns, do not emit… by @mayurinehate in https://github.com/datahub-project/datahub/pull/7393
- feat(frontend): add additional tabs to glossary terms view by @alexey-kravtsov in https://github.com/datahub-project/datahub/pull/7392
- feat(auth): REST API authorization by @RyanHolstien in https://github.com/datahub-project/datahub/pull/6614
- fix(ingest/kafka): Remove topic from browse path by @asikowitz in https://github.com/datahub-project/datahub/pull/7398
- feat(ingest/bigquery) - Emit cross-project usage from gcp logs by @treff7es in https://github.com/datahub-project/datahub/pull/7364
- feat(elasticsearch): support searchAfter by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7235
- docs(managed datahub): release notes for v0.2.1 by @anshbansal in https://github.com/datahub-project/datahub/pull/7414
- fix(frontend) support utf-8 charset by @lutongzero in https://github.com/datahub-project/datahub/pull/7405
- fix(ingest/bigquery) Filter upstream lineage by list of existing tables by @asikowitz in https://github.com/datahub-project/datahub/pull/7415
- refactor(ingest): lookml - fix up golden files in normalized form by @shirshanka in https://github.com/datahub-project/datahub/pull/7423
- fix(ingest/bigquery): Fixing double quoting in profiling approx count query by @treff7es in https://github.com/datahub-project/datahub/pull/7416
- fix(ingest): lookml - add support for includes, extends, view_name i… by @shirshanka in https://github.com/datahub-project/datahub/pull/7428
- fix(recommendations): fix recommendations on homepage by @david-leifker in https://github.com/datahub-project/datahub/pull/7433
- docs(website): fix homepage logo sizing by @jeffmerrick in https://github.com/datahub-project/datahub/pull/7430
- feat(queries): Adding Tooltips to Queries Tab by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7421
- fix(analytics): remove zero values being added in charts by @anshbansal in https://github.com/datahub-project/datahub/pull/7425
- docs(ingest): add ingestion configs guide by @hsheth2 in https://github.com/datahub-project/datahub/pull/7438
- fix(ingest/bigquery): Querying table metadata details in batch properly by @treff7es in https://github.com/datahub-project/datahub/pull/7429
- fix(ingest/snowflake): fixing Snowflake state issue by @treff7es in https://github.com/datahub-project/datahub/pull/7443
- refactor(tests): extract common code by @anshbansal in https://github.com/datahub-project/datahub/pull/7441
- fix date ranges being queried in charts by @anshbansal in https://github.com/datahub-project/datahub/pull/7444
- feat(tests): allow use of system auth for test session by @anshbansal in https://github.com/datahub-project/datahub/pull/7445
- fix(ingest/athena): Fix athena source if dbname is not specified in the connection string by @treff7es in https://github.com/datahub-project/datahub/pull/7417
- fix(lineage): Fixing Timeline Lineage Filters by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7435
- fix(ingest/unity): Use assigned metastore if not metastore listed in unity catalog by @treff7es in https://github.com/datahub-project/datahub/pull/7446
- chore(ingest): cleanup unused files/vars in tests by @hsheth2 in https://github.com/datahub-project/datahub/pull/7450
- Feat/s3 ingestion enhancement to update schema from latest partition by @nachiket-juneja in https://github.com/datahub-project/datahub/pull/7410
- chore(ingest/glue): cleanup deprecated `underlying_platform` config by @hsheth2 in https://github.com/datahub-project/datahub/pull/7449
- refactor(ingest): avoid allowing extras for all DataHubGraphConfig by @hsheth2 in https://github.com/datahub-project/datahub/pull/7448
- docs(ingest): add more guidelines for writing sources by @hsheth2 in https://github.com/datahub-project/datahub/pull/7451
- fix(smoke): add missing test resource by @hsheth2 in https://github.com/datahub-project/datahub/pull/7455
- refactor(ingest): subtypes - standardize by @shirshanka in https://github.com/datahub-project/datahub/pull/7437
- docs(ingest): add details about backwards compatibility guarantees by @hsheth2 in https://github.com/datahub-project/datahub/pull/7439
- fix(ui) Merge duplicate schema fields on siblings regardless of casing by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7413
- fix(kafka-setup): configure sasl.mechanism in case SASL_PLAINTEXT by @k-popov in https://github.com/datahub-project/datahub/pull/7447
- docs(managed): v0.2.2 managed datahub release notes by @david-leifker in https://github.com/datahub-project/datahub/pull/7456
- chore(ci): exclude duckdb from smoke test by @anshbansal in https://github.com/datahub-project/datahub/pull/7458
- fix(ingest/bigquery): simplify type annotations for bigquery usage by @hsheth2 in https://github.com/datahub-project/datahub/pull/7457
- feat(ingest): Introduce FileBackedDict for offloading data to disk by @asikowitz in https://github.com/datahub-project/datahub/pull/7461
- fix(ingest/dbt): remove deprecated `backcompat_skip_source_on_lineage_edge` option by @hsheth2 in https://github.com/datahub-project/datahub/pull/7466
- refactor(ingest): use auto_stale_entity_removal in json schema source by @hsheth2 in https://github.com/datahub-project/datahub/pull/7465
- fix(ingest/bigquery): update bigquery platform_instance capability by @TonyOuyangGit in https://github.com/datahub-project/datahub/pull/7467
- fix(ingest/s3): propagate s3 endpoint to profiling by @tmemenga in https://github.com/datahub-project/datahub/pull/7431
- fix(ingest): remove extraneous platform configs by @hsheth2 in https://github.com/datahub-project/datahub/pull/7454
- feat(ingest/bigquery) - Capture dataset labels in bigquery by @treff7es in https://github.com/datahub-project/datahub/pull/7460
- Add setup job labels to compose files by @szalai1 in https://github.com/datahub-project/datahub/pull/7473
- chore(ci): upgrade GE version by @anshbansal in https://github.com/datahub-project/datahub/pull/7290
- fix(ingest/dbt): check for nodes key before accessing by @khgould in https://github.com/datahub-project/datahub/pull/7462
- fix(search): per field analyzers for simple_query_string by @david-leifker in https://github.com/datahub-project/datahub/pull/7436
- tests(cypress): add improved Cypress tests for timeline lineage by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7464
- fix(ui) Standardize subtypes casing with View Definition tab by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7477
- feat(elasticsearch): validate index.blocks.write setting by @david-leifker in https://github.com/datahub-project/datahub/pull/7478
- feat(ingest/tableau): project path and container support by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7426
- refactor(ingest): Convert FileBackedDict to dataclass for cleaner init by @asikowitz in https://github.com/datahub-project/datahub/pull/7469
- chore(ingest): pin acryl-datahub-classify by @hsheth2 in https://github.com/datahub-project/datahub/pull/7485
- fix(ingest/tableau): load project workbook hierarchy correctly by @hsheth2 in https://github.com/datahub-project/datahub/pull/7483
- fix(ingest): redact auth info in curl commands by @hsheth2 in https://github.com/datahub-project/datahub/pull/7496
- fix(ingest): prevent logging from blowing up on TypeErrors by @hsheth2 in https://github.com/datahub-project/datahub/pull/7497
- fix(ui) Make tooltip on search results stats summary clearer by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7492
- fix(ui) Fix UI flickering when switching between glossary entities by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7432
- feat(ingest/GX): add urn lowercasing option for GX assertions by @mayurinehate in https://github.com/datahub-project/datahub/pull/7472
- feat(cli): introduce remote config for quickstart by @szalai1 in https://github.com/datahub-project/datahub/pull/7424
- feat(ingestion): powerbi # support Google BigQuery table lineage by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7502
- feat(ingest): unbundle airflow plugin emitter dependencies by @cburroughs in https://github.com/datahub-project/datahub/pull/7493
- feat(cli): finalizing quickstart config commit hash by @szalai1 in https://github.com/datahub-project/datahub/pull/7509
- feat(ingest/postgres): support estimated row counts in profiling by @arunvasudevan in https://github.com/datahub-project/datahub/pull/7476
- fix(ingest/bigquery): fix missing materialized views by @mayurinehate in https://github.com/datahub-project/datahub/pull/7511
- fix(ingest): make quickstart error handling more robust by @hsheth2 in https://github.com/datahub-project/datahub/pull/7513
- fix(ingest): limit typing_extensions classes to those available in min version by @cburroughs in https://github.com/datahub-project/datahub/pull/7490
- feat(ingest/vertica): improve vertica type mappings by @NotYuki in https://github.com/datahub-project/datahub/pull/7459
- chore(ingest): remove unused dependency for bigquery by @mayurinehate in https://github.com/datahub-project/datahub/pull/7510
- feat(ingest/looker): upgrade to Looker API from 3.1 to 4.0 by @feljen in https://github.com/datahub-project/datahub/pull/7411
- Docs update by @szalai1 in https://github.com/datahub-project/datahub/pull/7517
- feat(graphql): Added GraphQL mappings for the "created" and "lastModified" fields in "DatasetProperties" aspect by @siladitya2 in https://github.com/datahub-project/datahub/pull/7463
- docs(guidelines) Update community guidelines by @maggiehays in https://github.com/datahub-project/datahub/pull/7518
- fix(ui-ingestion) Fix UI manual ingestion runs by consistently setting pipeline_name by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7521
- feat(docs-website): support category links by @hsheth2 in https://github.com/datahub-project/datahub/pull/7516
- feat(ingest/powerbi): support PowerBI parameter references by @hsheth2 in https://github.com/datahub-project/datahub/pull/7523
- feat(ingest): enable joins across FileBackedDicts + add FileBackedList by @hsheth2 in https://github.com/datahub-project/datahub/pull/7506
- fix(): Fix Query Detail Modal Scroll + add misc log messages by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7530
- fix(frontend proxy): Disable unnecessary URL encoding at the proxy layer by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7532
- fix(ingest): delta-lake - support assume aws role by @shirshanka in https://github.com/datahub-project/datahub/pull/7524
- docs(ingest): add guidelines around proactive version pinning by @hsheth2 in https://github.com/datahub-project/datahub/pull/7534
- docs(): add sources summary page by @laulpogan in https://github.com/datahub-project/datahub/pull/7480
- fix(grafana): use variable datasource uid by @maaaikoool in https://github.com/datahub-project/datahub/pull/7488
- fix(ingest/looker): stringify looker user ids by @hsheth2 in https://github.com/datahub-project/datahub/pull/7531
- Revert "docs(): add sources summary page" by @laulpogan in https://github.com/datahub-project/datahub/pull/7546
- feat(openapi): add relationships endpoint by @shirshanka in https://github.com/datahub-project/datahub/pull/7547
- Add documentation example for using restoreIndices with an urnLike argument by @iprentic in https://github.com/datahub-project/datahub/pull/7544
- fix(ingest/snowflake): bump up classification library version to 0.0.6 by @mayurinehate in https://github.com/datahub-project/datahub/pull/7542
- fix(test): suppress s3 golden file test for specific paths by @shirshanka in https://github.com/datahub-project/datahub/pull/7551
- fix(docs-website): reflect PythonSDK & GraphQL Docs changes by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7557
- feat(search): searchAcrossEntities/Lineage improvements by @david-leifker in https://github.com/datahub-project/datahub/pull/7550
- docs(): re-add sources summary page by @laulpogan in https://github.com/datahub-project/datahub/pull/7563
- Update restore indices docs to include batch information by @iprentic in https://github.com/datahub-project/datahub/pull/7564
- feat(ingest): fix edge cases + interface cleanup for file-system APIs by @hsheth2 in https://github.com/datahub-project/datahub/pull/7533
- feat(ingest): powerbi # store powerbi entity descriptions by @looppi in https://github.com/datahub-project/datahub/pull/7154
- fix(cli): Adding exit code to correctly return failure or success by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7520
- feat(cli): switch default quickstart to v0.10.0 by @hsheth2 in https://github.com/datahub-project/datahub/pull/7567
- chore(ci): try Qodana Scan for quality by @anshbansal in https://github.com/datahub-project/datahub/pull/7560
- chore(ci): add daylight savings timezone for tests, fix daylight saving bug in analytics charts by @anshbansal in https://github.com/datahub-project/datahub/pull/7484
- fix(lineage): nullpointer exceptions by @anshbansal in https://github.com/datahub-project/datahub/pull/7577
- docs(managed ingestion): add release notes for v0.2.3 by @anshbansal in https://github.com/datahub-project/datahub/pull/7578
- fix(logging): increase log level for system-upgrade job to complete before starting by @iprentic in https://github.com/datahub-project/datahub/pull/7566
- fix(ui) Safeguard ingestion execution request check by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7584
- refactor(ui): Separate entity lineage counts query from rest of entity query by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7569
- feat(ingest/snowflake): use auto_workunit_reporter helper by @hsheth2 in https://github.com/datahub-project/datahub/pull/7568
- feat(ingest/kafka-connect): add stateful ingestion and platform instance support by @mayurinehate in https://github.com/datahub-project/datahub/pull/7526
- fix(gms): convert obj to string, fix wrong setup by @anshbansal in https://github.com/datahub-project/datahub/pull/7582
- refactor(ingest): Use shared connection wrapper over connection cache by @asikowitz in https://github.com/datahub-project/datahub/pull/7570
- Extend character limit for Create Domain Modal by @gabe-lyons in https://github.com/datahub-project/datahub/pull/7589
- fix(smoke-test): always use built images in smoke tests by @hsheth2 in https://github.com/datahub-project/datahub/pull/7587
- feat(ingest/s3): support path_specs of different S3 buckets in the same recipe by @harsha-mandadi-4026 in https://github.com/datahub-project/datahub/pull/7514
- fix(ingest): pin `typeguard` version for feast by @hsheth2 in https://github.com/datahub-project/datahub/pull/7591
- chore(ci): update dependencies, fix smoke image build by @anshbansal in https://github.com/datahub-project/datahub/pull/7580
- chore(deps): bump @sideway/formula from 3.0.0 to 3.0.1 in /datahub-web-react by @dependabot in https://github.com/datahub-project/datahub/pull/7554
- fix(ingest/powerbi): support each expression in m-query function invocation by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7541
- fix(ingestion): Readd batchDelayMs by @egemenberk in https://github.com/datahub-project/datahub/pull/7559
- chore(deps): bump @sideway/formula from 3.0.0 to 3.0.1 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/7553
- fix(docker): fix elasticsearch image tag by @david-leifker in https://github.com/datahub-project/datahub/pull/7548
- feat(docs): add docs on lineage by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7576
- refactor: misc fixes logging, annotations by @anshbansal in https://github.com/datahub-project/datahub/pull/7579
- fix(policies): add missing policies, add check to prevent problems by @anshbansal in https://github.com/datahub-project/datahub/pull/7586
- docs: misc fixes by @anshbansal in https://github.com/datahub-project/datahub/pull/7603
- feat: add docs on adding column/dataset description by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7597
- feat(cli): show image pull progress in quickstart by @hsheth2 in https://github.com/datahub-project/datahub/pull/7593
- fix(ingest/snowflake): Allow SnowflakeObjectAccessEntry.objectId to be None by @asikowitz in https://github.com/datahub-project/datahub/pull/7601
- docs(): Add View-related permissions to DataHub docs by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7600
- feat(ingest): add urn modification helper by @hsheth2 in https://github.com/datahub-project/datahub/pull/7440
- feat: add docs on creating tags/terms/datasets by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7608
- feat(metadata-io): add support in Neo4jGraphService for lineage time filter by @shidianshifen in https://github.com/datahub-project/datahub/pull/7375
- docs: add new code examples on creating entities & fix minor typos by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7613
- chore(ci): fix flakiness, misc improvements by @anshbansal in https://github.com/datahub-project/datahub/pull/7605
- feat(ingest/docs): json-schema fixes, improvements to ingestion doc generation by @shirshanka in https://github.com/datahub-project/datahub/pull/7615
- fix(docker): fix gradle quickstart version parsing by @hsheth2 in https://github.com/datahub-project/datahub/pull/7614
- fix(elasticsearch): make indexNameMapping in IndexConventionImpl threadsafe by @iprentic in https://github.com/datahub-project/datahub/pull/7565
- docs: add CLI installiation guide via poetry by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7619
- fix(ingest/docs): improve matcher to include types with spaces in them by @shirshanka in https://github.com/datahub-project/datahub/pull/7631
- docs: reformat use case guide toc & api comparison table by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7621
- docs: fix image in development by @jx2lee in https://github.com/datahub-project/datahub/pull/7637
- docs: fix typo and image by @yoonhyejin in https://github.com/datahub-project/datahub/pull/7635
- feat(ingestion): powerbi # Amazon Redshift lineage support by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7562
- fix(ingest/dbt): introduce lowercase column urn option by @alex-magno in https://github.com/datahub-project/datahub/pull/7418
- fix(smoke-test): fix native user and access token tests by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7628
- build(docker): metadata-ingestion images build and add slim version by @david-leifker in https://github.com/datahub-project/datahub/pull/7412
- fix(search): tags with colons exercises search with urns, must follow… by @david-leifker in https://github.com/datahub-project/datahub/pull/7602
- feat(ingest): add auto_materialize_referenced_tags helper by @hsheth2 in https://github.com/datahub-project/datahub/pull/7626
- fix(ingest): remove get_platform_instance_id from stateful ingestion by @hsheth2 in https://github.com/datahub-project/datahub/pull/7572
- fix(ingest/superset): support superset v2 by @hsheth2 in https://github.com/datahub-project/datahub/pull/7588
- fix(entity registry): Fix patching aspects onto existing Config based entity by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7624
- fix(docker): fix image name for datahub-ingestion-slim by @shirshanka in https://github.com/datahub-project/datahub/pull/7653
- misc fixes by @david-leifker in https://github.com/datahub-project/datahub/pull/7633
- fix(impactAnalysis): fix filtering for lightning mode search (#1225) by @shirshanka in https://github.com/datahub-project/datahub/pull/7652
- fix(platform): Ensure time based lineage handles noop changes by @shirshanka in https://github.com/datahub-project/datahub/pull/7657
- refactor(ui): Fix scrolling behavior for compact entity profile by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7599
- feat(ingestion): powerbi # support platform instance by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7583
- feat(ingestion): powerbi # uniquly identify the multiple instance of same platform by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7632
- fix(datahub-upgrade) custom timeseries aspect index creation issue. by @siladitya2 in https://github.com/datahub-project/datahub/pull/7622
- fix(ui): Fix download to CSV flow using Scroll across entities api by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7629
- fix(search): missing model updates and tests by @david-leifker in https://github.com/datahub-project/datahub/pull/7617
- fix(revert): remove unnecessary class check by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7658
- lint(test): remove unused imports, other test fixes by @david-leifker in https://github.com/datahub-project/datahub/pull/7659
- refactor(ui): Loading schema dynamically for dataset profile by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7558
- fix(ui): Address regression in column usage stats + add unit test by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7645
- refactor(ui): Make Navigating DataHub UI easier, fix duplicate tracking, duplicate networks calls, + misc optimizations by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7592
- refactor(lineage): Refactor getAndUpdatePaths inside of ESGraphQueryDao by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7556
- feat(cli): build and upload Python wheels in CI by @hsheth2 in https://github.com/datahub-project/datahub/pull/7537
- fix(ingest/bigquery): Pass whether view is materialized; pass last_altered correctly by @asikowitz in https://github.com/datahub-project/datahub/pull/7660
- feat(docs-website): add vercel preview environment by @hsheth2 in https://github.com/datahub-project/datahub/pull/7644
New Contributors
- @jinlintt made their first contribution in https://github.com/datahub-project/datahub/pull/7303
- @asikowitz made their first contribution in https://github.com/datahub-project/datahub/pull/7317
- @shidianshifen made their first contribution in https://github.com/datahub-project/datahub/pull/7219
- @sandertan made their first contribution in https://github.com/datahub-project/datahub/pull/7344
- @amartinson193 made their first contribution in https://github.com/datahub-project/datahub/pull/7400
- @lutongzero made their first contribution in https://github.com/datahub-project/datahub/pull/7405
- @nachiket-juneja made their first contribution in https://github.com/datahub-project/datahub/pull/7410
- @k-popov made their first contribution in https://github.com/datahub-project/datahub/pull/7447
- @TonyOuyangGit made their first contribution in https://github.com/datahub-project/datahub/pull/7467
- @tmemenga made their first contribution in https://github.com/datahub-project/datahub/pull/7431
- @khgould made their first contribution in https://github.com/datahub-project/datahub/pull/7462
- @NotYuki made their first contribution in https://github.com/datahub-project/datahub/pull/7459
- @siladitya2 made their first contribution in https://github.com/datahub-project/datahub/pull/7463
- @iprentic made their first contribution in https://github.com/datahub-project/datahub/pull/7544
- @yoonhyejin made their first contribution in https://github.com/datahub-project/datahub/pull/7557
- @harsha-mandadi-4026 made their first contribution in https://github.com/datahub-project/datahub/pull/7514
- @egemenberk made their first contribution in https://github.com/datahub-project/datahub/pull/7559
- @alex-magno made their first contribution in https://github.com/datahub-project/datahub/pull/7418
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.10.0...v0.10.1
v0.10.0
Release Highlights
Potential Downtime
This release introduces substantial improvements to search functionality which require reindexing indices.
During the reindexing:
- a system-update job will set indices to read-only and create a backup/clone of each index
- new components will be prevented from start-up until the reindex completes
- Helm deployments will go into read-only mode and new ingestion runs will fail
This process can take anywhere from 5 minutes to multiple hours; as a rough estimate, please expect it to take 1 hour for every 2.3 million entities. After the reindex is complete, please check your ingestion runs and re-run any that did not complete.
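To plan a maintenance window, the rough estimate above can be turned into a quick calculation. This is only a sketch of the stated rule of thumb (1 hour per 2.3 million entities); the function name and default are illustrative, and actual duration depends on your cluster and hardware.

```python
def estimate_reindex_hours(entity_count: int, entities_per_hour: float = 2_300_000) -> float:
    """Rough reindex duration in hours, using the release notes' rule of thumb.

    Assumption: duration scales linearly with entity count. Real reindex
    times vary with Elasticsearch sizing, so treat this as an estimate only.
    """
    if entity_count < 0:
        raise ValueError("entity_count must be non-negative")
    return entity_count / entities_per_hour


if __name__ == "__main__":
    for count in (500_000, 2_300_000, 10_000_000):
        print(f"{count:>12,} entities: ~{estimate_reindex_hours(count):.1f} h")
```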
User Experience
We have some really exciting improvements to the DataHub user experience in this release!
Improved documentation editor, contributed by @ngamanda and the Grab Team. This work provides a much more intuitive documentation editing experience within the UI, providing “what you see is what you get” formatting & removing the need for markdown expertise.
Additionally, you can easily:
- Add links to other entities/users within DataHub
- Embed and resize tables & images
- Toggle between font sizes and formats
- Embed syntax-highlighted code blocks
Filter lineage graphs based on time windows: You can now easily see the full lineage graph of an entity at a specific point in time. This makes it much easier to understand how interdependencies have evolved over time and to troubleshoot data issues in the past.
Improvements in Search: As noted above, we have rolled out substantial improvements to Search functionality, making it easier than ever for end users to find the entities that matter most. This release includes:
- Stemming & Synonyms
- Search by full or partial URN
- Autocomplete improvements
- Quoted search analyzer for exact & prefix match
Metadata Ingestion
Here are some of the most notable ingestion-related improvements:
- Redshift: You can now extract lineage information from unload queries – thanks for the contrib, @mmmeeedddsss
- PowerBI: Ingestion now maps Workspaces to DataHub Containers – thanks for the contrib, @looppi
- BigQuery: You can now extract lineage metadata from the Catalog API – thanks for the contrib, @PatrickfBraz
- Glue: Ingestion now uses table name as the human-readable name – thanks for the contrib, @danielcmessias
Developer Experience
- This release introduces DataHub Lite - a new experimental lightweight implementation of DataHub. It is intended to enable local developer tooling use-cases such as simple access to metadata for scripts and other tools. DataHub Lite is compatible with the DataHub metadata format and all the ingestion connectors that DataHub supports. Check out the docs here.
Breaking Changes
[#7103](https://github.com/datahub-project/datahub/pull/7103) This should only impact users who have configured explicit non-default names for DataHub's Kafka topics. The environment variables used by the kafka-setup docker image to configure DataHub's Kafka topics have been updated to be in line with other DataHub components; for more info, see our docs on Configuring Kafka in DataHub. These variables were previously suffixed with `_TOPIC`; the correct suffix is now `_TOPIC_NAME`. This change should not affect any user who is using default Kafka topic names.
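If you maintain custom topic overrides, the suffix change can be applied mechanically. The sketch below assumes your overrides are held in a plain dictionary; the variable names in the usage example (`EXAMPLE_EVENT_TOPIC`, `OTHER_SETTING`) are hypothetical placeholders, not actual DataHub settings, so check the Kafka configuration docs for the real names before renaming anything.

```python
from typing import Dict


def rename_topic_vars(env: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of env with a trailing _TOPIC suffix changed to _TOPIC_NAME.

    Keys that already end in _TOPIC_NAME, or that are unrelated to topics,
    are copied through unchanged.
    """
    renamed: Dict[str, str] = {}
    for key, value in env.items():
        if key.endswith("_TOPIC"):
            key = f"{key}_NAME"
        renamed[key] = value
    return renamed


if __name__ == "__main__":
    # Hypothetical variable names, for illustration only.
    old_env = {"EXAMPLE_EVENT_TOPIC": "custom_events_v1", "OTHER_SETTING": "unchanged"}
    print(rename_topic_vars(old_env))
```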
What's Changed
- fix(ci): only scan on master branch by @anshbansal in https://github.com/datahub-project/datahub/pull/7047
- fix(ci): use trivy offline scanning by @anshbansal in https://github.com/datahub-project/datahub/pull/7050
- docs(get-started) Simplify copy on Get Started landing page by @maggiehays in https://github.com/datahub-project/datahub/pull/7043
- fix(ingest/kafka): fix ResourceType import error for confluent_kafka<1.9.0 by @mayurinehate in https://github.com/datahub-project/datahub/pull/7046
- docs(dbt): fix indentation in dbt meta mapping docs by @jx2lee in https://github.com/datahub-project/datahub/pull/7045
- fix(ingest): temporarily disable vertica tests by @hsheth2 in https://github.com/datahub-project/datahub/pull/7059
- feat(editor): improve documentation editor using Remirror by @ngamanda in https://github.com/datahub-project/datahub/pull/6631
- fix(bootstrap): add EDIT_LINEAGE privilege to some default policies by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7060
- feat(ingest): add entity registry in codegen by @hsheth2 in https://github.com/datahub-project/datahub/pull/6984
- feat(ingest): extract powerbi endorsements to tags by @looppi in https://github.com/datahub-project/datahub/pull/6638
- feat(ingestion): pull metabase database, schema names from raw query and api by @remisalmon in https://github.com/datahub-project/datahub/pull/7039
- fix(ingest): support multiple entity_registry sections by @hsheth2 in https://github.com/datahub-project/datahub/pull/7066
- ci(ingest): add flag to skip tests but run codegen during release by @hsheth2 in https://github.com/datahub-project/datahub/pull/7067
- fix(ingest): preserve dbt column name casing by @hsheth2 in https://github.com/datahub-project/datahub/pull/7063
- fix(ingest/tableau): fix node limit exceeded error for workbooks query by @mayurinehate in https://github.com/datahub-project/datahub/pull/7068
- fix(build/airflow): Fixing gradlew path by @treff7es in https://github.com/datahub-project/datahub/pull/7069
- feat(ingest): support snapshots in dbt and dbt-cloud by @hsheth2 in https://github.com/datahub-project/datahub/pull/7062
- fix(ui) Fix duplicate schema field rendering with siblings by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7057
- refactor(ingest/athena): Replace `s3_staging_dir` parameter in Athena source with `query_result_location` by @bossenti in https://github.com/datahub-project/datahub/pull/7044
- feat(ingest): fix handling of unions with aliases in post restli conversion by @hsheth2 in https://github.com/datahub-project/datahub/pull/7058
- fix(ui) Make checkboxes in ingestion forms easier to see by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7061
- fix(ingest): support git clone of non-github repos by @hsheth2 in https://github.com/datahub-project/datahub/pull/7065
- feat(ingest): reporting revamp, part 1 by @hsheth2 in https://github.com/datahub-project/datahub/pull/7031
- fix(secret-service): fix default encrypt key by @david-leifker in https://github.com/datahub-project/datahub/pull/7074
- feat(datahub-lite): introduces a new experimental lightweight impleme… by @shirshanka in https://github.com/datahub-project/datahub/pull/7052
- feat(datahub-lite): adding tab completion, small serialization fixes by @shirshanka in https://github.com/datahub-project/datahub/pull/7079
- docs: add docs for managed DataHub v0.1.72 by @anshbansal in https://github.com/datahub-project/datahub/pull/7070
- docs(readme): add inovex as adopter by @DSchmidtDev in https://github.com/datahub-project/datahub/pull/7077
- docs: add warning about clearing cookies for login by @anshbansal in https://github.com/datahub-project/datahub/pull/7084
- feat(cache): add hazelcast distributed cache option by @RyanHolstien in https://github.com/datahub-project/datahub/pull/6645
- docs(datahub-lite): small improvement for zsh tab completion by @shirshanka in https://github.com/datahub-project/datahub/pull/7085
- fix(ingest/bigquery): clear stateful ingestion correctly by @hsheth2 in https://github.com/datahub-project/datahub/pull/7075
- fix(graphql): Return with appropriate status code instead of stacktrace by @szalai1 in https://github.com/datahub-project/datahub/pull/7086
- fix(sso): Clear cookies on SSO redirect error by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7088
- fix(docs): add missing mutation literal by @ruedigerblock in https://github.com/datahub-project/datahub/pull/7082
- fix(ui): display the correct access token expiry in AccessTokenModal by @ngamanda in https://github.com/datahub-project/datahub/pull/7078
- fix(cli/lite): fix datahub lite serve command by @hsheth2 in https://github.com/datahub-project/datahub/pull/7089
- fix(profiling): Fix syntax for APPROX_COUNT_DISTINCT on bigquery and snowflake by @feljen in https://github.com/datahub-project/datahub/pull/7087
- fix(ingest): fix logic error of google protobuf wrapper type. by @wngus606 in https://github.com/datahub-project/datahub/pull/7076
- feat(ui): Documentation Editor Improvements by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7072
- fix(uri): marks uri field as deprecated, removes problem code, and adds coercer for usages of URI typeref by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7093
- fix(build): postgres docker secret by @david-leifker in https://github.com/datahub-project/datahub/pull/7092
- fix(ingest/snowflake): handle corrupted snowflake OCSP cache file by @hsheth2 in https://github.com/datahub-project/datahub/pull/7095
- refactor(ingest): Refactoring container creation to common place by @treff7es in https://github.com/datahub-project/datahub/pull/6877
- feat(ingest): move datahub-lite to optional dep and add shim when missing by @hsheth2 in https://github.com/datahub-project/datahub/pull/7097
- fix(docker): support non amd64 dockerize in setup containers by @tonycsoka in https://github.com/datahub-project/datahub/pull/7091
- test(ingest): fix kafka admin client mocking by @hsheth2 in https://github.com/datahub-project/datahub/pull/7098
- fix(build): Fix postgres setup gha by @david-leifker in https://github.com/datahub-project/datahub/pull/7104
- fix(ingest/profile): properly quoting approx_count_distinct by @treff7es in https://github.com/datahub-project/datahub/pull/7101
- style(models): Replaces non-ASCII charactes in pdl files with ASCII c… by @nmbryant in https://github.com/datahub-project/datahub/pull/7105
- feat(ingest): hide cartesian product warnings in GE profiler by @hsheth2 in https://github.com/datahub-project/datahub/pull/7096
- feat(ingest): add removing partition pattern in spark lineage by @ssilb4 in https://github.com/datahub-project/datahub/pull/6605
- feat(redshift): Fetch lineage from unload queries by @mmmeeedddsss in https://github.com/datahub-project/datahub/pull/7041
- fix(ci): do not confirm on force for deletion by @anshbansal in https://github.com/datahub-project/datahub/pull/7106
- fix(analytics): add missing usage events causing warning in logs by @anshbansal in https://github.com/datahub-project/datahub/pull/7109
- feat(quickstart): Remove kafka-setup as a hard deployment requirement by @pedro93 in https://github.com/datahub-project/datahub/pull/7073
- fix(tests): Fixing add_users smoke test by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7116
- chore(deps): bump ua-parser-js from 0.7.32 to 0.7.33 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/7122
- docs(gms): clarify behavior of soft deletion in UI by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7117
- fix(kafka-setup): Make topic name consistent with other images by @pedro93 in https://github.com/datahub-project/datahub/pull/7103
- chore(deps): bump ua-parser-js from 0.7.32 to 0.7.33 in /datahub-web-react by @dependabot in https://github.com/datahub-project/datahub/pull/7123
- feat(ingest): powerbi # add powerbi workspaces to containers by @looppi in https://github.com/datahub-project/datahub/pull/6532
- fix(diffMode): prevent misconfiguration of diff mode by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7127
- fix(ui) Display glossary term name in analytics page properly by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7128
- fix(ui): only use visible and enabled tabs for selected tab and routing in entity profiles by @Masterchen09 in https://github.com/datahub-project/datahub/pull/6629
- fix(htrace): remove htrace jar by @szalai1 in https://github.com/datahub-project/datahub/pull/7126
- feat(datahub-lite): simplify get response by @shirshanka in https://github.com/datahub-project/datahub/pull/7131
- fix(doc/biquery): Updating bigquery capability doc by @treff7es in https://github.com/datahub-project/datahub/pull/7136
- fix(ci): do not fail fast for matrix runs by @anshbansal in https://github.com/datahub-project/datahub/pull/7132
- refactor(ui): refactor capitalization of platform name and sub types by @Masterchen09 in https://github.com/datahub-project/datahub/pull/7099
- refactor(cli): extract method, change wording by @anshbansal in https://github.com/datahub-project/datahub/pull/7134
- docs(lineage): Updating Lineage feature guide by @maggiehays in https://github.com/datahub-project/datahub/pull/6257
- removing WIP by @laulpogan in https://github.com/datahub-project/datahub/pull/7140
- docs(oidc): Updating + improving docs around OIDC configuration by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7141
- fix(ingest): add message proto check by @tinolyu in https://github.com/datahub-project/datahub/pull/7130
- fix(ingest): use snowflake median function in profiling by @hsheth2 in https://github.com/datahub-project/datahub/pull/6987
- feat(ui): allow removing parentNodes of Glossary Nodes and Glossary Terms by @ngamanda in https://github.com/datahub-project/datahub/pull/7135
- feat(ui) Add new embedded profile to be displayed in extension by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7113
- feat(ingest): add `--log-file` option and show CLI logs in UI report by @hsheth2 in https://github.com/datahub-project/datahub/pull/7118
- fix(misc): NPE and GraphQL case fixes by @david-leifker in https://github.com/datahub-project/datahub/pull/7149
- fix(ingest/snowflake): fix regression in approx count distinct by @hsheth2 in https://github.com/datahub-project/datahub/pull/7146
- [docs] fix typo / add missing line for docker compose / attach overwriting system action config for confluent. by @kdongho in https://github.com/datahub-project/datahub/pull/7142
- reordering sidebar and adding homepage to apis by @laulpogan in https://github.com/datahub-project/datahub/pull/7139
- fix(ingestion): powerbi # Not all arguments converted to string by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7157
- fix(ui): Sort top users by their query count in datasets stats tab by @jaykadambi in https://github.com/datahub-project/datahub/pull/7148
- refactor(ui): Updates to Manual Lineage search by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7151
- feat(ui) Build entity doesn't exist page for entity profiles by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7150
- ci(ingest): fix broken CI workflow for metadata-ingestion by @hsheth2 in https://github.com/datahub-project/datahub/pull/7161
- fix(ingest): azuread group mapping do not stop ingestion by @anshbansal in https://github.com/datahub-project/datahub/pull/7169
- fix(docs): Fixes links to docs templates by @viniciusdsmello in https://github.com/datahub-project/datahub/pull/7171
- refactor(ui ingest): Allow enabling / disabling ingestion schedule easily by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7162
- fix(ingest): switch various sources to `auto_stale_entity_removal` helper by @hsheth2 in https://github.com/datahub-project/datahub/pull/7158
- docs(townhall) Update Townhall History doc by @maggiehays in https://github.com/datahub-project/datahub/pull/7180
- test(ingest/delta-lake): fix spurious directory creation by @hsheth2 in https://github.com/datahub-project/datahub/pull/7179
- feat: add a linter for github actions workflows by @hsheth2 in https://github.com/datahub-project/datahub/pull/7178
- fix(quickstart): adding back kafka-setup by @szalai1 in https://github.com/datahub-project/datahub/pull/7181
- fix(docs) Fix broken links in ingestion docs by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7183
- fix(ingest/GX): fix snowflake urn generated from connection string by @mayurinehate in https://github.com/datahub-project/datahub/pull/7173
- feat(ingest): switch dbt to use `auto_stale_entity_removal` by @hsheth2 in https://github.com/datahub-project/datahub/pull/7160
- fix(ingest): fix issue in glue tests by @hsheth2 in https://github.com/datahub-project/datahub/pull/7185
- fix(log): logging timestamp in ISO8601 format instead of time by @anshbansal in https://github.com/datahub-project/datahub/pull/7188
- feat(ingest): bigquery - extracts lineage metadata from catalog api by @PatrickfBraz in https://github.com/datahub-project/datahub/pull/7137
- fix(ingest/tableau): show warning about token expiry for PATs by @hsheth2 in https://github.com/datahub-project/datahub/pull/7187
- fix(ingest/vertica): Fixing missing container properties by @treff7es in https://github.com/datahub-project/datahub/pull/7197
- chore(deps): bump Netty from 4.1.85.Final to 4.1.86.Final by @janhicken in https://github.com/datahub-project/datahub/pull/7191
- docs(ingestion): powerbi # Add permission for DAX and mashup expressions by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7195
- feat(elasticsearch): Elasticsearch improvements by @david-leifker in https://github.com/datahub-project/datahub/pull/6894
- fix(test): spark-lineage # build task as dependency of integrationTest by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7189
- chore(sample): add status removed aspect for sample data by @anshbansal in https://github.com/datahub-project/datahub/pull/7203
- docs(managed datahub): release notes for v0.1.73 by @anshbansal in https://github.com/datahub-project/datahub/pull/7194
- fix(bootstrapdata): update timestamp to be in the last 1 year by @szalai1 in https://github.com/datahub-project/datahub/pull/7206
- fix(ingest/bigquery): quoting for APPROX_COUNT_DISTINCT in BigQuery by @mryorik in https://github.com/datahub-project/datahub/pull/7207
- fix(versioning): Ensure that CLI version is always dot-delimited even in minor release versions by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7200
- fix(test): missing variables in test causing error in logs by @anshbansal in https://github.com/datahub-project/datahub/pull/7210
- feat(mlModel): mark downstream jobs as ml model downstreams lineage by @mayurinehate in https://github.com/datahub-project/datahub/pull/7205
- ci(): fix datahub-upgrade quickstart regression by @hsheth2 in https://github.com/datahub-project/datahub/pull/7217
- feat(ingest): Add custom properties to the ldap ingestion by @bda618 in https://github.com/datahub-project/datahub/pull/7125
- fix(ingest): upgrade feast to avoid build issues by @hsheth2 in https://github.com/datahub-project/datahub/pull/7218
- fix(ui) Increase the number of assertions that we query for in tab by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7215
- fix(ci): trivy code scanning fix by @anshbansal in https://github.com/datahub-project/datahub/pull/7232
- feat(glue): Use table name as human-readable name for Glue ingestion by @danielcmessias in https://github.com/datahub-project/datahub/pull/7213
- feat(ui): Supporting display of columns and storage count in previews by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7198
- fix(gms): Fixes delete references for single relationship aspects by @pedro93 in https://github.com/datahub-project/datahub/pull/7211
- docs(ingest/lineage): clarify name field in entity config for file based lineage by @mayurinehate in https://github.com/datahub-project/datahub/pull/7225
- fix(ui): typo 'Documenataion' by @vojtechneradatos in https://github.com/datahub-project/datahub/pull/7227
- fix(cli/delete): skip references prompt if deleting an aspect by @hsheth2 in https://github.com/datahub-project/datahub/pull/7220
- fix(ingest/tableau): implement workbook_page_size parameter by @hsheth2 in https://github.com/datahub-project/datahub/pull/7216
- fix(gms): Corrects MCP generation in async mode by @pedro93 in https://github.com/datahub-project/datahub/pull/7214
- fix(ingest): redshift # build late binding view lineage when sql written in upper case by @looppi in https://github.com/datahub-project/datahub/pull/7223
- fix(siblings) Fix editing of schema fields for siblings with unequal schemas by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7199
- fix(ingest-idp): emit empty GroupMembership when there are no groups by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7196
- feat(lineage): add time filtering for lineage edges by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7159
- chore(deps): bump http-cache-semantics from 4.1.0 to 4.1.1 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/7230
- refactor(docs): Minor language updates for kafka source doc header by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7237
- docs(website): fix feature availability dark mode styles by @jeffmerrick in https://github.com/datahub-project/datahub/pull/7233
- chore(log/docs): improve error log, docs by @anshbansal in https://github.com/datahub-project/datahub/pull/7239
- fix(dev.sh): Add context to kafka-setup build by @szalai1 in https://github.com/datahub-project/datahub/pull/7234
- feat(cli): improve docker quickstart by @hsheth2 in https://github.com/datahub-project/datahub/pull/7184
- fix(elasticsearch): fix orphan index clean up pattern, consistent top… by @david-leifker in https://github.com/datahub-project/datahub/pull/7242
- chore(deps): bump http-cache-semantics from 4.1.0 to 4.1.1 in /datahub-web-react by @dependabot in https://github.com/datahub-project/datahub/pull/7231
New Contributors
- @bossenti made their first contribution in https://github.com/datahub-project/datahub/pull/7044
- @ruedigerblock made their first contribution in https://github.com/datahub-project/datahub/pull/7082
- @feljen made their first contribution in https://github.com/datahub-project/datahub/pull/7087
- @tonycsoka made their first contribution in https://github.com/datahub-project/datahub/pull/7091
- @tinolyu made their first contribution in https://github.com/datahub-project/datahub/pull/7130
- @kdongho made their first contribution in https://github.com/datahub-project/datahub/pull/7142
- @jaykadambi made their first contribution in https://github.com/datahub-project/datahub/pull/7148
- @viniciusdsmello made their first contribution in https://github.com/datahub-project/datahub/pull/7171
- @mryorik made their first contribution in https://github.com/datahub-project/datahub/pull/7207
- @danielcmessias made their first contribution in https://github.com/datahub-project/datahub/pull/7213
- @vojtechneradatos made their first contribution in https://github.com/datahub-project/datahub/pull/7227
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.9.6...v0.9.7
v0.9.6.1
Release Highlights
Please disregard release v0.9.6 in favor of this release v0.9.6.1
Bug fix for secrets encryption
- Prevents decryption errors for existing secrets
- Affects reading ingestion secrets created with a previous release
- Affects native user password validation
What's Changed
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.9.6...v0.9.6.1
v0.9.6
Release Highlights
User Experience
We now support embedding Dashboards, Charts, and Datasets. This allows us to do things like directly embed Looker / Tableau / Mode / Redash Looks, Dashboards, Explores into the Dataset pages themselves.
[Experimental] You can now customize the number of queries displayed on the Query tab of a Dataset entity
Improved error messaging for bulk editing via the UI
Metadata Ingestion
- Update to data profiling to allow configurable number of sample values to be returned
- Postgres ingestion now supports emitting lineage edges for Views - shoutout to @LucasRoesler for the contribution!
- Snowflake ingestion now supports extracting tags - shoutout to @frsann for the contribution!
- Vertica ingestion now supports projections and lineage - thanks for the contribution, @vishalkSimplify!
- Glue ingestion now emits an s3 lineage edge when data was written with an s3a/s3n client - thanks for the contribution, @danielli-ziprecruiter!
Developer Experience
- Fixes quickstart/docker compose issues for M1 machines
- Improvements in reliability and performance of the Restli Service endpoints for ingestion:
  - Scale Restli Service thread pool based on CPU
  - Add retry (exp backoff) to Restli Entity Client
  - MCE no longer relies on GMS for Restli service
  - Converted Restli Service from standalone servlet to Spring injectable
- Docker build externalized (significantly faster on m1, <7 minute build times, based on this)
- Frontend asset generation refactor (causing tests to fail intermittently)
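For readers unfamiliar with the retry behavior mentioned above, here is a minimal sketch of retrying with exponential backoff. It illustrates the general technique only: the Restli Entity Client itself is part of DataHub's Java code, and the function names, defaults, and delay cap below are assumptions made for this example, not DataHub's actual implementation or settings.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int, base: float = 0.5, cap: float = 8.0) -> List[float]:
    """Delay before each retry: base, 2*base, 4*base, ... limited to cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_retry(
    fn: Callable[[], T],
    retries: int = 3,
    base: float = 0.5,
    cap: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying up to `retries` times with exponentially growing waits."""
    delays = backoff_delays(retries, base, cap)
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(delays[attempt])
    raise AssertionError("unreachable")


if __name__ == "__main__":
    print(backoff_delays(5))
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.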
What's Changed
- feat(ingest): add pydantic helper for removed fields by @hsheth2 in https://github.com/datahub-project/datahub/pull/6853
- chore(0.9.5): Bump defaults for release v0.9.5 by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6856
- Revert "fix(ci): remove warnings due to deprecated action" by @anshbansal in https://github.com/datahub-project/datahub/pull/6857
- refactor(restli-mce-consumer) by @david-leifker in https://github.com/datahub-project/datahub/pull/6744
- fix(ci): reduce smoke test run time by @anshbansal in https://github.com/datahub-project/datahub/pull/6841
- fix(security): require signed/encrypted jwt tokens by @david-leifker in https://github.com/datahub-project/datahub/pull/6565
- feat(ingest): update profiling to fetch configurable number of sample values by @mayurinehate in https://github.com/datahub-project/datahub/pull/6859
- feat(ingest/airflow): support raw dataset urns in airflow lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/6854
- refactor(graphql): make graphqlengine easier to use by @anshbansal in https://github.com/datahub-project/datahub/pull/6865
- fix(kafka): datahub-upgrade job by @david-leifker in https://github.com/datahub-project/datahub/pull/6864
- feat(ingest): pass timeout config in kafka admin client api calls by @mayurinehate in https://github.com/datahub-project/datahub/pull/6863
- chore(ingest): loosen requirements file by @hsheth2 in https://github.com/datahub-project/datahub/pull/6867
- feat(ingest): upgrade pydantic version by @cccs-eric in https://github.com/datahub-project/datahub/pull/6858
- fix(elasticsearch): fixes out of order runId writes by @david-leifker in https://github.com/datahub-project/datahub/pull/6845
- chore(ingest): loosen additional requirements by @hsheth2 in https://github.com/datahub-project/datahub/pull/6868
- feat(ingest): bigquery/snowflake - Store last profile date in state by @treff7es in https://github.com/datahub-project/datahub/pull/6832
- docs(google-analytics): Correct grammatical error in README.md by @jx2lee in https://github.com/datahub-project/datahub/pull/6870
- feat(CI): add venv caching by @szalai1 in https://github.com/datahub-project/datahub/pull/6843
- feat(ingest/snowflake): handle failures gracefully and raise permission failures by @mayurinehate in https://github.com/datahub-project/datahub/pull/6748
- fix(runid): always update runid, except when queued by @david-leifker in https://github.com/datahub-project/datahub/pull/6876
- fix(ingest): conditionally include env in assertion guid by @hsheth2 in https://github.com/datahub-project/datahub/pull/6811
- chore(ci): update dependencies docs-website by @anshbansal in https://github.com/datahub-project/datahub/pull/6871
- feat(ui) - Add a custom error message for bulk edit to add clarity by @mkamalas in https://github.com/datahub-project/datahub/pull/6775
- docs(adding users): Refreshing the docs for adding new DataHub Users by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6879
- test(mce-consumer): mockbeans by @david-leifker in https://github.com/datahub-project/datahub/pull/6878
- feat(ingest): avoid embedding serialized json in metadata files by @hsheth2 in https://github.com/datahub-project/datahub/pull/6742
- refactor(gradle): move the local docker registry to common location by @david-leifker in https://github.com/datahub-project/datahub/pull/6881
- refactor(smoke): use env variables by @anshbansal in https://github.com/datahub-project/datahub/pull/6866
- fix(lint): pin pydantic version by @anshbansal in https://github.com/datahub-project/datahub/pull/6886
- refactor(docs): Correctly spell elasticsearch in docs by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6880
- fix(ingest): okta undefined variable error by @anshbansal in https://github.com/datahub-project/datahub/pull/6882
- fix(ci): reduce flakiness in add_users, siblings smoke test by @anshbansal in https://github.com/datahub-project/datahub/pull/6883
- fix(ingest): fall back to default table comment method for all Trino query errors by @marvin-roesch in https://github.com/datahub-project/datahub/pull/6873
- test(misc): misc test updates by @david-leifker in https://github.com/datahub-project/datahub/pull/6890
- deprecate(ingest): bigquery - Removing bigquery-legacy source by @treff7es in https://github.com/datahub-project/datahub/pull/6851
- chore(ingest): remove inferred args to MCPW, part 1 by @hsheth2 in https://github.com/datahub-project/datahub/pull/6819
- test(ingest/kafka-connect): make docker setup more reliable by @hsheth2 in https://github.com/datahub-project/datahub/pull/6902
- fix(ingest): profiling (bigquery) - Address biquery profiling query error due to timestamp vs data mismatch by @treff7es in https://github.com/datahub-project/datahub/pull/6874
- fix(cli): Make datahub quickstart work with latest docker compose in M1 by @pedro93 in https://github.com/datahub-project/datahub/pull/6891
- fix(cli): fix delete urn cli bug + stricter type annotations by @hsheth2 in https://github.com/datahub-project/datahub/pull/6903
- fix(ingest/airflow): reorder imports to avoid cyclical dependencies by @stijndehaes in https://github.com/datahub-project/datahub/pull/6719
- feat: remove jq requirement + tweak modeldocgen args by @hsheth2 in https://github.com/datahub-project/datahub/pull/6904
- chore(ingest): loosen pyspark and pydeequ deps by @hsheth2 in https://github.com/datahub-project/datahub/pull/6908
- docs(ingest/looker): fix typos + update lookml github action example by @hsheth2 in https://github.com/datahub-project/datahub/pull/6910
- fix(ingest/metabase): use card_id in dashboard to chart lineage by @ccpypy in https://github.com/datahub-project/datahub/pull/6583
- fix(es-setup): create data stream on non-aws by @szalai1 in https://github.com/datahub-project/datahub/pull/6926
- Adding missing Platform logos by @maggiehays in https://github.com/datahub-project/datahub/pull/6892
- feat(ingestion): PowerBI# Improve PowerBI source ingestion by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6549
- Fix compose context for kafka-setup by @szalai1 in https://github.com/datahub-project/datahub/pull/6923
- feat(backend): Supporting Embeddable Previews for Dashboards, Charts, Datasets by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6875
- chore(deps): bump json5 from 2.2.1 to 2.2.3 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/6930
- chore(deps): bump json5 from 1.0.1 to 1.0.2 in /datahub-web-react by @dependabot in https://github.com/datahub-project/datahub/pull/6931
- fix(ci): managed ingestion test fix by @anshbansal in https://github.com/datahub-project/datahub/pull/6946
- feat(ingest): add `include_table_location_lineage` flag for SQL common by @hsheth2 in https://github.com/datahub-project/datahub/pull/6934
- feat(ingest): allow extracting snowflake tags by @frsann in https://github.com/datahub-project/datahub/pull/6500
- chore(ingest): unpin pydantic dep by @hsheth2 in https://github.com/datahub-project/datahub/pull/6909
- chore(ingest): partially revert pyspark dep from #6908 by @hsheth2 in https://github.com/datahub-project/datahub/pull/6954
- fix(ingest): use branch info when cloning git repos by @hsheth2 in https://github.com/datahub-project/datahub/pull/6937
- chore(ingest): remove inferred args to MCPW, part 2 by @hsheth2 in https://github.com/datahub-project/datahub/pull/6905
- fix(ingest/unity): simplify MCP generation and reporting by @hsheth2 in https://github.com/datahub-project/datahub/pull/6911
- chore(ci): parallelise build and test workflow to reduce time by @anshbansal in https://github.com/datahub-project/datahub/pull/6949
- fix(frontend): sasl.client.callback.handler.class by @szalai1 in https://github.com/datahub-project/datahub/pull/6962
- chore(react): remove outdated cypress tests and dependency by @anshbansal in https://github.com/datahub-project/datahub/pull/6948
- fix(ci): restrict GE to fix build issues by @anshbansal in https://github.com/datahub-project/datahub/pull/6967
- feat(queries): [Experimental] Allow customization of # of queries in Query tab via env var by @gabe-lyons in https://github.com/datahub-project/datahub/pull/6964
- feat(ingest/postgres): emit lineage for postgres views by @LucasRoesler in https://github.com/datahub-project/datahub/pull/6953
- feat(ingest/vertica): support projections and lineage in vertica by @vishalkSimplify in https://github.com/datahub-project/datahub/pull/6785
- fix(ingest): add missing dep for powerbi by @hsheth2 in https://github.com/datahub-project/datahub/pull/6969
- Docs fixes week of 12 22 by @laulpogan in https://github.com/datahub-project/datahub/pull/6963
- fix(ingest): unfreeze bigquery/snowflake column dataclass by @mayurinehate in https://github.com/datahub-project/datahub/pull/6921
- chore(frontend) Remove unused dependencies from package.json by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6974
- chore: misc fixes by @anshbansal in https://github.com/datahub-project/datahub/pull/6966
- feat(ingest/glue): emit s3 lineage for s3a and s3n schemes by @danielli-ziprecruiter in https://github.com/datahub-project/datahub/pull/6788
- fix(kafka-setup): Make kafka-setup run with multiple threads by @pedro93 in https://github.com/datahub-project/datahub/pull/6970
- feat(ingest): mark database_alias and env as deprecated by @hsheth2 in https://github.com/datahub-project/datahub/pull/6901
- fix(docs): Updating Tag, Glossary Term docs to point to correct GraphQL methods by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6965
- chore(deps): bump certifi from 2020.12.5 to 2022.12.7 in /metadata-ingestion/src/datahub/ingestion/source/feast_image by @dependabot in https://github.com/datahub-project/datahub/pull/6979
- fix(ingest): profiling - Fixing issue with the wrong timestamp stored in check by @treff7es in https://github.com/datahub-project/datahub/pull/6978
- config(quickstart): enable auto-reindex for quickstart by @david-leifker in https://github.com/datahub-project/datahub/pull/6983
- feat(privileges) - Create a privilege to manage glossary children recursively by @mkamalas in https://github.com/datahub-project/datahub/pull/6731
- chore(ingest): finish removing feast-legacy by @hsheth2 in https://github.com/datahub-project/datahub/pull/6985
- feat(ingest): add import descriptions of two or more nested messages by @wngus606 in https://github.com/datahub-project/datahub/pull/6959
- feat(docs) Add feature guide for Manual Lineage by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6933
- docs(rfc): Serialising GMS Updates with Preconditions by @mattmatravers in https://github.com/datahub-project/datahub/pull/5818
- fix(ingest/kafka-connect) support newer version of debezium by @jaegwonseo in https://github.com/datahub-project/datahub/pull/6943
- fix(docs): build and broken snowflake docs fix by @anshbansal in https://github.com/datahub-project/datahub/pull/6997
- fix(ingest): bigquery - views in case more than 1 datasets with views by @anshbansal in https://github.com/datahub-project/datahub/pull/6995
- fix(docs): Renaming Business Glossary Doc by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7001
- fix(ingest/snowflake): fix type annotations + refactor get_connect_args by @hsheth2 in https://github.com/datahub-project/datahub/pull/7004
- fix(docs): Changing the platform event topic name in kafka custom topic docs by @blankon123 in https://github.com/datahub-project/datahub/pull/7007
- fix(docs): fix name of privilege referenced in posts doc by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7002
- fix(SSO): Correctly redirect to originally requested URL in SSO by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7011
- fix(ingest): remove dead code from tests by @hsheth2 in https://github.com/datahub-project/datahub/pull/7005
- feat(ingestion): Tableau # Embed links by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6994
- feat(auth) Update auth cookies to have same-site none for chrome extension by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6976
- docs(website): DPG WIP by @maggiehays in https://github.com/datahub-project/datahub/pull/6998
- docs: resize datahub logo by @hsheth2 in https://github.com/datahub-project/datahub/pull/7014
- fix(kafka-setup): Remove reference to non-existing topic by @pedro93 in https://github.com/datahub-project/datahub/pull/7019
- fix(ingest): powerbi # use display name field as title for powerbi report page by @looppi in https://github.com/datahub-project/datahub/pull/7017
- feat(auth) Allow session ttl to be configurable by env variable by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7022
- fix(ui): URL Encode all Entity Profile URLs by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7023
- fix(ui ingest): Fix test connection when stateful ingest is enabled by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7013
- docs(sso) move root user warning to earlier in SSO guides by @maggiehays in https://github.com/datahub-project/datahub/pull/7028
- fix(ingest/looker): add clarity in chart input parsing logs by @hsheth2 in https://github.com/datahub-project/datahub/pull/7003
- chore(ingest): remove duplicate data_platform.json file by @hsheth2 in https://github.com/datahub-project/datahub/pull/7026
- feat(ingestion): PowerBI # Remove corpUserInfo aspect ingestion by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7034
- fix(metadata-models): remove unnecessary bin folder by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7035
- fixing typos by @maggiehays in https://github.com/datahub-project/datahub/pull/7030
New Contributors
- @marvin-roesch made their first contribution in https://github.com/datahub-project/datahub/pull/6873
- @stijndehaes made their first contribution in https://github.com/datahub-project/datahub/pull/6719
- @ccpypy made their first contribution in https://github.com/datahub-project/datahub/pull/6583
- @LucasRoesler made their first contribution in https://github.com/datahub-project/datahub/pull/6953
- @vishalkSimplify made their first contribution in https://github.com/datahub-project/datahub/pull/6785
- @wngus606 made their first contribution in https://github.com/datahub-project/datahub/pull/6959
- @jaegwonseo made their first contribution in https://github.com/datahub-project/datahub/pull/6943
- @blankon123 made their first contribution in https://github.com/datahub-project/datahub/pull/7007
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.9.5...v0.9.6
v0.9.4
Release Highlights
KNOWN ISSUES
There is a known issue with OIDC which we will address in a fast-follow release. If you use OIDC, please wait for v0.9.5 to upgrade.
User Experience
Manual Lineage is LIVE! You can now add and remove lineage between entities in the Lineage Visualization screen, making it easier than ever to manage the complex relationships between your data resources.
Our new Views feature makes it easy to create curated sets of Entities within DataHub. This is a great way to start to isolate the entities that matter most, and provide your DataHub end-users with a streamlined view of the assets that are relevant to their use cases.
In-App Product Tours are here! When logging into DataHub and/or visiting a new page type for the first time, new users will be prompted with a helpful walkthrough of core functionality to get them familiar with the platform. We’ll continue to add modules as we roll out new features!
Automatically send updates to Slack and/or Microsoft Teams when changes are made within DataHub by leveraging the new Slack and Teams Actions
Metadata Ingestion
- We’re continuing to improve the user experience for UI-based ingestion for the following sources:
  - dbt Cloud
  - DataBricks Unity Catalog
  - MySQL
  - Trino/Presto
  - MSSQL
  - MariaDB
- If you’re just getting started with UI-based Ingestion, check out our new BigQuery & Snowflake guides
- Stateful ingestion is now supported for Iceberg (thanks for the contrib, @cccs-Dustin!) and LDAP (thanks for the contrib, @bda618!)
- Speaking of Stateful Ingestion, we’re taking some steps to simplify the code behind Sta
What's Changed
- chore(): Updating default CLI version, update updating-datahub.md by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6590
- fix(ingest): profiling - Profiling failed if column cardinality threw an error by @treff7es in https://github.com/datahub-project/datahub/pull/6582
- fix(actions): add missing datahub-gms-protocol env var by @shirshanka in https://github.com/datahub-project/datahub/pull/6593
- fix(ingest): restrict snowflake-connector-python dependency by @mayurinehate in https://github.com/datahub-project/datahub/pull/6594
- feat(ingest/bigquery): avoid creating/deleting tables for profiling by @hsheth2 in https://github.com/datahub-project/datahub/pull/6578
- fix(ingest): unify emit interface by @hsheth2 in https://github.com/datahub-project/datahub/pull/6592
- fix(security): security version updates by @david-leifker in https://github.com/datahub-project/datahub/pull/6602
- docs: remove Kafka Streams from documentation by @maver1ck in https://github.com/datahub-project/datahub/pull/6596
- refactor(ui): Improving Kafka UI Ingestion Form, Create Domain, Create Secret Modals by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6588
- fix(ingest): clarify tableau auth error messages by @hsheth2 in https://github.com/datahub-project/datahub/pull/6600
- docs(graphql): fix deleteTest "Create"->"Delete" by @nickwu241 in https://github.com/datahub-project/datahub/pull/6574
- fix(gms/startup): remove set -x from start.sh by @timcosta in https://github.com/datahub-project/datahub/pull/6589
- feat(sql): Add SQL index on createdon field by @pedro93 in https://github.com/datahub-project/datahub/pull/6522
- feat(ml model): updating view of ml model feature list by @gabe-lyons in https://github.com/datahub-project/datahub/pull/6576
- fix(ingest/bigquery): ignore complex types from profiling by @treff7es in https://github.com/datahub-project/datahub/pull/6613
- feat(ingest): add external url for snowflake objects by @mayurinehate in https://github.com/datahub-project/datahub/pull/6580
- chore(ingest): bump and pin mypy by @hsheth2 in https://github.com/datahub-project/datahub/pull/6584
- fix(ingest): only require github_info for lookml and not looker by @hsheth2 in https://github.com/datahub-project/datahub/pull/6608
- docs(ingest): add airflow docs that use the `PythonVirtualenvOperator` by @hsheth2 in https://github.com/datahub-project/datahub/pull/6604
- fix(ui) Fix double scroll in embedded list search sections by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6618
- feat(ingest): print detailed GMS error messages by @djordje-mijatovic in https://github.com/datahub-project/datahub/pull/6519
- Townhall agenda wikimedia by @maggiehays in https://github.com/datahub-project/datahub/pull/6622
- fix(analytics): skip ListDomains if user cannot manage domains and have only one loading message by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6624
- feat(quickstart): add support for passing thru env vars needed by Sla… by @shirshanka in https://github.com/datahub-project/datahub/pull/6591
- docs(actions): slack, teams by @shirshanka in https://github.com/datahub-project/datahub/pull/6632
- fix(logging): Remove lombok as source of slf4j-api by @david-leifker in https://github.com/datahub-project/datahub/pull/6616
- docs: add links from main README to slack, teams actions by @shirshanka in https://github.com/datahub-project/datahub/pull/6633
- feat(ingest): Support config variable for specifying a direct privat… by @mayurinehate in https://github.com/datahub-project/datahub/pull/6609
- Add AWS Postgres Iam Auth jar to GMS by @syedzoherer in https://github.com/datahub-project/datahub/pull/6371
- feat(ingest/snowflake): support filtering by fully qualified schema_pattern by @mayurinehate in https://github.com/datahub-project/datahub/pull/6611
- feat(ingest/kafka-connect): support MongoSourceConnector by @frsann in https://github.com/datahub-project/datahub/pull/6416
- feat(graph) Add createdOn, createdActor, updatedOn, updatedActor to graph edges by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6615
- refactor(ui): Making improvements to UI ingestion forms, adding MySQL, Trino, Presto, MSSQL, MariaDB forms by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6607
- perf(ui-ingestion): cache on creation or deletion of ingestion sources to reduce latency by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6647
- feat(ingest): add dummy data source for automated testing by @anshbansal in https://github.com/datahub-project/datahub/pull/6550
- docs(managed datahub): adding release notes for v0.1.70 by @anshbansal in https://github.com/datahub-project/datahub/pull/6655
- feat(gms): Pluggable Authentication & Authorization Framework by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6634
- docs: move rfcs to separate repo by @laulpogan in https://github.com/datahub-project/datahub/pull/6621
- fix(ingest): fix lingering demo-data source issues by @hsheth2 in https://github.com/datahub-project/datahub/pull/6659
- feat(ingest): bigquery - Running lineage extraction after metadata extraction by @treff7es in https://github.com/datahub-project/datahub/pull/6653
- fix(ingest): issue deprecation warning correctly by @hsheth2 in https://github.com/datahub-project/datahub/pull/6623
- chore(ingest): remove feast-legacy by @hsheth2 in https://github.com/datahub-project/datahub/pull/6661
- fix(ingest/snowflake): support domains for snowflake schema containers by @hsheth2 in https://github.com/datahub-project/datahub/pull/6662
- build(deps): bump decode-uri-component from 0.2.0 to 0.2.2 in /datahub-web-react by @dependabot in https://github.com/datahub-project/datahub/pull/6617
- feat(ingest/dbt): add support for latest DBT version 1.3 by @MatthieuBlais in https://github.com/datahub-project/datahub/pull/6651
- docs: add languages to code highlighting by @hsheth2 in https://github.com/datahub-project/datahub/pull/5576
- docs(typo) Correct typo in domains.md by @maggiehays in https://github.com/datahub-project/datahub/pull/6667
- feat(gms): Enable auth-api publishing to maven by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6671
- fix(ingest/powerbi-report-server): deprecate unused graphql config by @daha in https://github.com/datahub-project/datahub/pull/6630
- fix(docker): Fix datahub-frontend dockerfile by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6670
- fix(ingest): profiling - Changing profiling defaults by @treff7es in https://github.com/datahub-project/datahub/pull/6640
- feat(ci): add smoke test for domain mutation by @anshbansal in https://github.com/datahub-project/datahub/pull/6641
- fix(datahub-protobuf): fix missing httpclient dependency by @shirshanka in https://github.com/datahub-project/datahub/pull/6672
- feat(ingest): update snowflake docs, add simple validations by @mayurinehate in https://github.com/datahub-project/datahub/pull/6636
- fix(gms): DataHub Auth API java doc fix by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6674
- feat(ingest): run profiler in more cardinality cases by @hsheth2 in https://github.com/datahub-project/datahub/pull/6397
- docs(search) update broken youtube link by @maggiehays in https://github.com/datahub-project/datahub/pull/6678
- docs(protobuf): update examples for protobuf by @david-leifker in https://github.com/datahub-project/datahub/pull/6681
- feat(ingest): support knowledge links in business glossary by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6375
- fix(ingestion/vertica): support columns with timestamp precision by @inancdokurel in https://github.com/datahub-project/datahub/pull/6295
- feat(ingest): add timestamps for snowflake objects by @mayurinehate in https://github.com/datahub-project/datahub/pull/6570
- feat(onboarding): adds framework and some steps for onboarding steps UI by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6462
- feat(ingest): use entry point for registering transformers by @Masterchen09 in https://github.com/datahub-project/datahub/pull/6628
- chore(ci): update base ingestion image requirements file by @anshbansal in https://github.com/datahub-project/datahub/pull/6687
- fix(ci): reduce warnings due to deprecated action by @anshbansal in https://github.com/datahub-project/datahub/pull/6686
- refactor(ui): Adding caching for users, groups, and roles by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6673
- fix(ci): revert confluent kafka in base image by @anshbansal in https://github.com/datahub-project/datahub/pull/6690
- fix(security): version bump to latest minor python image by @david-leifker in https://github.com/datahub-project/datahub/pull/6694
- docs(ingest/salesforce): list required permissions by @orlandine in https://github.com/datahub-project/datahub/pull/6610
- feat(ingest): bigquery - option to set on behalf project by @treff7es in https://github.com/datahub-project/datahub/pull/6660
- ci: stop commenting test results on PR by @hsheth2 in https://github.com/datahub-project/datahub/pull/6700
- fix(auth-api): Attempting to fix publish for auth-api by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6695
- build(deps): bump qs from 6.5.2 to 6.5.3 in /smoke-test/tests/cypress by @dependabot in https://github.com/datahub-project/datahub/pull/6663
- build(deps): bump express from 4.17.1 to 4.18.2 in /datahub-web-react by @dependabot in https://github.com/datahub-project/datahub/pull/6665
- fix(ingest/tableau): support ssl_verify flag properly by @hsheth2 in https://github.com/datahub-project/datahub/pull/6682
- fix(config): unify the handling of boolean environment variables by @Masterchen09 in https://github.com/datahub-project/datahub/pull/6684
- fix(ui): fix search on policy builder by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6703
- build(deps): bump qs from 6.5.2 to 6.5.3 in /datahub-web-react by @dependabot in https://github.com/datahub-project/datahub/pull/6664
- fix(ingest): cleanup config extra usage by @hsheth2 in https://github.com/datahub-project/datahub/pull/6699
- docs(logos): add Great Expectations logo by @maggiehays in https://github.com/datahub-project/datahub/pull/6698
- fix(security): play framework upgrade by @david-leifker in https://github.com/datahub-project/datahub/pull/6626
- fix(ingest/sagemaker): handle missing ProcessingInputs field by @hsheth2 in https://github.com/datahub-project/datahub/pull/6697
- build: add retries to gradle wrapper download in ingestion docker by @hsheth2 in https://github.com/datahub-project/datahub/pull/6704
- test(quickstart): add debugging to quickstart test by @david-leifker in https://github.com/datahub-project/datahub/pull/6718
- fix(setup): Bump setup images to alpine 3.14 with arch based on machine OS. by @pedro93 in https://github.com/datahub-project/datahub/pull/6612
- fix(ingest): fix bug in auto_status_aspect by @hsheth2 in https://github.com/datahub-project/datahub/pull/6705
- fix(security): commons-text, hadoop-commons versions by @david-leifker in https://github.com/datahub-project/datahub/pull/6723
- fix(build): rename conflicting module `auth-api` by @david-leifker in https://github.com/datahub-project/datahub/pull/6728
- docs(aws): edit markdown link by @jx2lee in https://github.com/datahub-project/datahub/pull/6706
- fix(ingest): fix mysql ingestion issue with non-lowercase database by @mayurinehate in https://github.com/datahub-project/datahub/pull/6713
- feat(ingest): redact configs reported in ingestion_run_summary by @hsheth2 in https://github.com/datahub-project/datahub/pull/6696
- fix(ingest): rectify filter for BigQuery external tables by @janhicken in https://github.com/datahub-project/datahub/pull/6691
- feat(ingest): add separate config for include_column_lineage in snowf… by @mayurinehate in https://github.com/datahub-project/datahub/pull/6712
- fix(ci): flakiness due to onboarding tour in add user test by @anshbansal in https://github.com/datahub-project/datahub/pull/6734
- feat(ui): Support DataBricks Unity Catalog Source in Ui Ingestion by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6707
- feat(ingest/iceberg): add stateful ingestion by @cccs-Dustin in https://github.com/datahub-project/datahub/pull/6344
- doc(restore): document restore indices API endpoint by @anshbansal in https://github.com/datahub-project/datahub/pull/6737
- feat(): Views Feature Milestone 1 by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6666
- feat(ingest): bigquery - external url support and a small profiling filter fix by @treff7es in https://github.com/datahub-project/datahub/pull/6714
- test(ingest): make hive/trino test more reliable by @hsheth2 in https://github.com/datahub-project/datahub/pull/6741
- Initial commit for bigquery ingestion guide by @treff7es in https://github.com/datahub-project/datahub/pull/6587
- fix(ci): remove warnings due to deprecated action by @anshbansal in https://github.com/datahub-project/datahub/pull/6735
- feat(ingest): add stateful ingestion to the ldap source by @bda618 in https://github.com/datahub-project/datahub/pull/6127
- fix(ingest): fix codegen `from_obj` for empty dicts in unions with null by @hsheth2 in https://github.com/datahub-project/datahub/pull/6745
- feat(ingest): start simplifying stateful ingestion state by @hsheth2 in https://github.com/datahub-project/datahub/pull/6740
- docs(gms): plugins# auth-api as compileOnly dependency by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6747
- fix(elasticsearch): build in resilience against IO exceptions on httpclient by @RyanHolstien in https://github.com/datahub-project/datahub/pull/6680
- ci: fix ingestion gradle retry by @hsheth2 in https://github.com/datahub-project/datahub/pull/6752
- fix(ingest): support airflow mapped operators by @cccs-seb in https://github.com/datahub-project/datahub/pull/6738
- fix(actions): fix mistype slack/teams base url by @ssilb4 in https://github.com/datahub-project/datahub/pull/6754
- fix(smoke-test): fix stateful ingestion test regression by @hsheth2 in https://github.com/datahub-project/datahub/pull/6753
- fix(auth): Renames metadata-auth archive name to not conflict with other modules. by @pedro93 in https://github.com/datahub-project/datahub/pull/6749
- fix(ingest/lookml): fix directory handling and a config validation bug by @hsheth2 in https://github.com/datahub-project/datahub/pull/6751
- refactor(ingest): bigquery-lineage - allow tables and datasets in uppercase by @PatrickfBraz in https://github.com/datahub-project/datahub/pull/6739
- refactor(ux): Misc UX Improvements (tutorial copy, caching, filters) by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6743
- Added build failed yarn error by @jakobhanna in https://github.com/datahub-project/datahub/pull/6757
- feat(ingest): remove source config from DatahubIngestionCheckpoint by @hsheth2 in https://github.com/datahub-project/datahub/pull/6722
- fix(python-sdk): DataHubGraph get_aspect should accept empty responses by @shirshanka in https://github.com/datahub-project/datahub/pull/6760
- fix(datahub-web-react): Properly escape a quote in React by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6764
- docs(ingest/airflow): clarify Airflow 1.x docs for airflow plugin by @hsheth2 in https://github.com/datahub-project/datahub/pull/6761
- feat(ingest): simplify more stateful ingestion state by @hsheth2 in https://github.com/datahub-project/datahub/pull/6762
- fix(ingest): bigquery - handling custom sql errors as warning in profiling by @treff7es in https://github.com/datahub-project/datahub/pull/6777
- docs(docker): add section for adding community images by @anshbansal in https://github.com/datahub-project/datahub/pull/6770
- docs(ingest): fix error in custom tags transformer example by @hsheth2 in https://github.com/datahub-project/datahub/pull/6767
- feat(ingest): add `datahub state inspect` command by @hsheth2 in https://github.com/datahub-project/datahub/pull/6763
- refactor(ui): Caching Ingestion Secrets by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6772
- docs(snowflake) Snowflake quick ingestion guide by @maggiehays in https://github.com/datahub-project/datahub/pull/6750
- Optimize kafka setup by @david-leifker in https://github.com/datahub-project/datahub/pull/6778
- feat(ingest/lookml): add unreachable views to report by @hsheth2 in https://github.com/datahub-project/datahub/pull/6779
- feat(ci): adding github security reporting to trivy scans by @shirshanka in https://github.com/datahub-project/datahub/pull/6773
- fix(smoke-test): remove stateful ingestion config check by @hsheth2 in https://github.com/datahub-project/datahub/pull/6781
- fix(ingest): correct external url for account identifier with account name by @mayurinehate in https://github.com/datahub-project/datahub/pull/6715
- fix(tutorial): skip getting steps if there is no user by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6786
- fix(kafka-setup): fix return code check by @david-leifker in https://github.com/datahub-project/datahub/pull/6782
- refactor(ui): Make include_tables and include_views default to True. Improve Tableau default recipe. by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6790
- fix(ingest): prevent NullPointerException when non-jdbc SaveIntoDataS… by @danielli-ziprecruiter in https://github.com/datahub-project/datahub/pull/6803
- docs(architecture): edit documents in architecture section by @jx2lee in https://github.com/datahub-project/datahub/pull/6798
- fix(ingest/dbt): remove unsupported usage indicator by @hsheth2 in https://github.com/datahub-project/datahub/pull/6805
- refactor(ui): Adding frontend caching + some misc. refactoring by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6796
- fix(ingest): bigquery - sharded table support improvements by @treff7es in https://github.com/datahub-project/datahub/pull/6789
- chore(ingest): pin black version by @hsheth2 in https://github.com/datahub-project/datahub/pull/6807
- refactor(ingest/stateful): remove most remaining state classes by @hsheth2 in https://github.com/datahub-project/datahub/pull/6791
- fix(profile): bigquery-legacy - Fix for TypeError-related failures in legacy plugin by @senapatim in https://github.com/datahub-project/datahub/pull/6806
- Update Grafana Dashboard by @NavinSharma13 in https://github.com/datahub-project/datahub/pull/6076
- refactor(ingest/stateful): remove `IngestionJobStateProvider` by @hsheth2 in https://github.com/datahub-project/datahub/pull/6792
- chore(ingest): bump python package dependencies to resolve vulns by @cyberay01 in https://github.com/datahub-project/datahub/pull/6384
- refactor(ingest/stateful): remove `get_last_state` method by @hsheth2 in https://github.com/datahub-project/datahub/pull/6794
- fix(ui): URL encode urns for ownership entity links by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6814
- fix(posts): add deletePost GraphQL endpoint by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6813
- fix(policies): resolve the associated domain for a domain as the domain itself by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6812
- feat(lineage) Adds ability to edit lineage manually from the UI by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6816
- fix(ui): change caching to happen post server-response when creating a UI ingestion recipe by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6815
- feat(ingest/stateful): remove platform_instance_id from state urn by @hsheth2 in https://github.com/datahub-project/datahub/pull/6795
- feat(ui): Adding DBT Cloud support for UI ingestion by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6804
- feat(kafka): expose default kafka producer mechanism by @djordje-mijatovic in https://github.com/datahub-project/datahub/pull/6381
New Contributors
- @maver1ck made their first contribution in https://github.com/datahub-project/datahub/pull/6596
- @MatthieuBlais made their first contribution in https://github.com/datahub-project/datahub/pull/6651
- @inancdokurel made their first contribution in https://github.com/datahub-project/datahub/pull/6295
- @orlandine made their first contribution in https://github.com/datahub-project/datahub/pull/6610
- @janhicken made their first contribution in https://github.com/datahub-project/datahub/pull/6691
- @cccs-Dustin made their first contribution in https://github.com/datahub-project/datahub/pull/6344
- @cccs-seb made their first contribution in https://github.com/datahub-project/datahub/pull/6738
- @ssilb4 made their first contribution in https://github.com/datahub-project/datahub/pull/6754
- @senapatim made their first contribution in https://github.com/datahub-project/datahub/pull/6806
- @cyberay01 made their first contribution in https://github.com/datahub-project/datahub/pull/6384
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.9.3...v0.9.4
V0.9.3
Release Highlights
User Experience
- Column Level Lineage Impact Analysis is live! Read more about it here
- You can now sort Dataset field names alphabetically - this is super handy for finding columns within wide datasets that may not have an easy-to-follow order by default [gif]
- Miscellaneous UX improvements:
  - “Explore All” button on home page, making it easier to jump into the search experience [gif]
  - “Share” button on entity pages [screenshot]
- [Community Contribution] You can now assign the same user as different owner types - thanks for the contrib, @rtekal!
Metadata Ingestion
- Snowflake Automated PII Classification is here! We’re eager for feedback on the utility of this feature - check out this guide, take it for a spin, and let us know what you think!
- We’ve simplified the configs required to add stateful ingestion to an ingestion source - check out the updated docs here
- Speaking of stateful ingestion, it’s now supported with:
  - Looker & LookML ingestion sources
  - [Community Contribution] Container-level ingestion – thanks for the contrib, @wangsaisai!
Developer Experience
- NEW! dbt Cloud ingestion is ready for ya - check out the module details here
- [Community Contribution] For those of you deploying DataHub with Neo4j, we now support Lineage Impact Analysis via Neo4j multihop functionality. Thanks for the contrib, @djordje-mijatovic!
- We’ve loosened our SQLAlchemy dependencies to support Airflow 2.3+
What's Changed
- fix(spark-lineage): Smoke test fix + smoke test m1 support by @treff7es in https://github.com/datahub-project/datahub/pull/6372
- feat(ingest): supports MCEs in domain transformer by @hsheth2 in https://github.com/datahub-project/datahub/pull/6364
- feat(ingest): enable container stateful ingestion by @wangsaisai in https://github.com/datahub-project/datahub/pull/6343
- build(ingest): pin mypy version by @hsheth2 in https://github.com/datahub-project/datahub/pull/6391
- build: use acryl's gradle-avro-plugin by @hsheth2 in https://github.com/datahub-project/datahub/pull/6390
- fix(ingest): unity - add missing date type by @ms32035 in https://github.com/datahub-project/datahub/pull/6385
- fix(ingest): unity-catalog - Removing unneeded sqlalchemy dependency to fix install by @treff7es in https://github.com/datahub-project/datahub/pull/6379
- feat(ingest/tableau): re-authenticate if the token expires by @hsheth2 in https://github.com/datahub-project/datahub/pull/6380
- fix(ingest): use profiler config settings correctly by @hsheth2 in https://github.com/datahub-project/datahub/pull/6354
- fix(ingest): handle error when query returns no columns in snowflake lineage by @mayurinehate in https://github.com/datahub-project/datahub/pull/6404
- fix(ingest): fix missing snowflake lineage when table_pattern is set by @mayurinehate in https://github.com/datahub-project/datahub/pull/6410
- feat(ingest): loosen sqlalchemy dep & support airflow 2.3+ by @hsheth2 in https://github.com/datahub-project/datahub/pull/6204
- fix(ingest/s3): add status aspect for detected s3 datasets by @mayurinehate in https://github.com/datahub-project/datahub/pull/6402
- fix(ingest/snowflake): loosen snowflake connector version requirement by @hsheth2 in https://github.com/datahub-project/datahub/pull/6418
- fix(mysql): fix native data type for mysql set type by @mayurinehate in https://github.com/datahub-project/datahub/pull/6407
- perf(ui): virtualized schema table rows by @stanbaker in https://github.com/datahub-project/datahub/pull/6287
- fix(ui) Improve HoverEntityTooltip and truncate parent glossary nodes by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6417
- feat(ingest): support incremental lineage to dbt node from external platform by @mayurinehate in https://github.com/datahub-project/datahub/pull/6392
- fix(ingest): init dataset props if missing in transformer by @hsheth2 in https://github.com/datahub-project/datahub/pull/6429
- fix(change-event): remove unnecessary dependencies on EntityChangeEventGeneratorRegistryFactory by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6431
- build(deps): bump moment-timezone from 0.5.34 to 0.5.35 in /datahub-web-react by @dependabot in https://github.com/datahub-project/datahub/pull/5783
- feat(frontend): Adding support to show externalUrl and institutionalMemoryFields for MLModels by @lurecas in https://github.com/datahub-project/datahub/pull/6053
- feat(model): adds properties, ownership, deprecated, institutional memory and tags as aspects for data platform instance entity by @sgomezvillamor in https://github.com/datahub-project/datahub/pull/5728
- docs(ingest/airflow): clarify docs around 1.x compat by @hsheth2 in https://github.com/datahub-project/datahub/pull/6436
- feat(recommendations): add last edited entities by @CorentinDuhamel in https://github.com/datahub-project/datahub/pull/6329
- fix(ingest): correctly compute entity change percentage by @hsheth2 in https://github.com/datahub-project/datahub/pull/6438
- docs(townhall) Updating Townhall History by @maggiehays in https://github.com/datahub-project/datahub/pull/6336
- Neo4j multihop support by @djordje-mijatovic in https://github.com/datahub-project/datahub/pull/6104
- fix(mae-consumer): Set proper variable expansion for JMX_OPTS and JAVA_OPTS in MAE docker by @skrydal in https://github.com/datahub-project/datahub/pull/6378
- docs(ingest): move prerequisite section before the ingestion recipe example by @mayurinehate in https://github.com/datahub-project/datahub/pull/6341
- fix(dataset): improve glossary term load performance for datasets by @Reilman79 in https://github.com/datahub-project/datahub/pull/6396
- feat(lineage) Implement CLL impact analysis for inputFields by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6426
- feat(ui) Add upgrade step to enable CLL impact analysis for existing data by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6427
- Added functionality to copy fieldpath and urn of each column by @Ankit-Keshari-Vituity in https://github.com/datahub-project/datahub/pull/6398
- fix(ingestion): add output converters for ODBC unsuported datatype in… by @LavinaVRovine in https://github.com/datahub-project/datahub/pull/6134
- fix(ui) Fix parentNodes overfetching everywhere it's used by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6446
- fix(ingest): snowflake - Fixing top query trimming in snowflake by @treff7es in https://github.com/datahub-project/datahub/pull/6447
- feat(elasticsearch): Updates to elasticsearch configuration, dao, tests by @david-leifker in https://github.com/datahub-project/datahub/pull/6269
- chore(ingest): fix mssql lint by @hsheth2 in https://github.com/datahub-project/datahub/pull/6453
- fix(ingest): add cli info to ingestion reporter by @hsheth2 in https://github.com/datahub-project/datahub/pull/6451
- fix(ui) Fix glossary side browser width fluctuating by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6457
- fix(python): Fix python dependencies for doc generation by @david-leifker in https://github.com/datahub-project/datahub/pull/6460
- docs(website): add homepage links by @jeffmerrick in https://github.com/datahub-project/datahub/pull/6458
- build(ingest): loosen jinja2 dependency for superset by @KulykDmytro in https://github.com/datahub-project/datahub/pull/6433
- fix(ingest): lowercase db name in mssql ingestion by @hsheth2 in https://github.com/datahub-project/datahub/pull/6448
- fix(ingest): handle missing schema in transformer by @hsheth2 in https://github.com/datahub-project/datahub/pull/6445
- feat(ingest): allow specific profiler config fields to override profile_table_level_only by @hsheth2 in https://github.com/datahub-project/datahub/pull/6366
- docs(enrichment) updating enrichment landing page by @maggiehays in https://github.com/datahub-project/datahub/pull/6286
- fix(home-page): remove redundant getAuthenticatedUser query by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6464
- feat(ingest): detect old or missing docker compose by @hsheth2 in https://github.com/datahub-project/datahub/pull/6466
- feat(ingestion): powerbi # Power BI report support by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6339
- fix(ingest/dbt): disable incremental lineage by default by @hsheth2 in https://github.com/datahub-project/datahub/pull/6467
- fix(loggin): print logging timestamp in ISO8601 format instead of jus… by @szalai1 in https://github.com/datahub-project/datahub/pull/6474
- docs(ingest/trino): add example of http connection by @hsheth2 in https://github.com/datahub-project/datahub/pull/6461
- refactor(ui): Simplify base glossary page toolbar by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6469
- revert: mssql - lowercase db name in mssql ingestion by @hsheth2 in https://github.com/datahub-project/datahub/pull/6481
- build: remove `Jinja2` dependency from `superset` by @KulykDmytro in https://github.com/datahub-project/datahub/pull/6476
- fix(roles): allows role service to unassign roles by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6434
- fix(docs): update the Okta and Azure AD docs to clarify the point of ingesting users by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6465
- Highlighted the description text on search by @Ankit-Keshari-Vituity in https://github.com/datahub-project/datahub/pull/6400
- Ownership type is deprecated by @jakobhanna in https://github.com/datahub-project/datahub/pull/6477
- feat(ui): Adding Explore all button on home page search by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6468
- fix(ingest): fix athena and GE lint errors by @hsheth2 in https://github.com/datahub-project/datahub/pull/6482
- refactor(ingest): simplify stateful ingestion config by @hsheth2 in https://github.com/datahub-project/datahub/pull/6454
- docs(ingest/tableau): required permissions + doc formatting by @hsheth2 in https://github.com/datahub-project/datahub/pull/6484
- feat(ingest): presto - Adding presto source by @treff7es in https://github.com/datahub-project/datahub/pull/6459
- fix(ui) Fix lineage graph rendering with duplicate nodes by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6480
- docs(cypress): adding local cypress running instructions by @gabe-lyons in https://github.com/datahub-project/datahub/pull/6492
- fix(managed ingestion): updating snowflake schema pattern placeholder text by @gabe-lyons in https://github.com/datahub-project/datahub/pull/6493
- feat(ui): Adding External URLs to search preview for Dataset, Container, DataFlow, DataJob by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6496
- fix(ingest/tableau): check `tableName` existence on datasource response by @lustefaniak in https://github.com/datahub-project/datahub/pull/6478
- fix(build): do not use neo4j for dev by @anshbansal in https://github.com/datahub-project/datahub/pull/6501
- docs(gms): update search example, do not use deprecated clause by @mayurinehate in https://github.com/datahub-project/datahub/pull/6340
- feat(ingest): add stateful ingestion support to looker and lookml source by @mayurinehate in https://github.com/datahub-project/datahub/pull/6443
- feat(ingest): dbt cloud integration by @hsheth2 in https://github.com/datahub-project/datahub/pull/6323
- fix(tableau): extra defensive error-handling by @hsheth2 in https://github.com/datahub-project/datahub/pull/6503
- fix(ingest): remove redundant types by @hsheth2 in https://github.com/datahub-project/datahub/pull/6486
- fix(ingest/snowflake): fix lineage allow/deny pattern typo by @hsheth2 in https://github.com/datahub-project/datahub/pull/6506
- fix(docs): add missing docs for 0.9.1 by @anshbansal in https://github.com/datahub-project/datahub/pull/6515
- feat(ui): Introducing Share Button on Entity Pages by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6450
- Added I AM auth for Opensearch by @syedzoherer in https://github.com/datahub-project/datahub/pull/6370
- fix(ingest): correctly handle transformer patch semantics by @hsheth2 in https://github.com/datahub-project/datahub/pull/6505
- feat(ingest/csv-enrich): handle BOM character by @hsheth2 in https://github.com/datahub-project/datahub/pull/6509
- feat(airflow): support kafka hook in the airflow plugin by @hsheth2 in https://github.com/datahub-project/datahub/pull/6508
- fix(patch): cover case where patch is used to create an entity by @RyanHolstien in https://github.com/datahub-project/datahub/pull/6504
- build(deps): bump loader-utils from 2.0.0 to 2.0.4 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/6452
- fix(ingest): add alias for bigquery-beta by @hsheth2 in https://github.com/datahub-project/datahub/pull/6521
- feat(ingest): add config for ingesting delta table without files by @mayurinehate in https://github.com/datahub-project/datahub/pull/6403
- fix(ingest): fix typo in unique count profiling by @mayurinehate in https://github.com/datahub-project/datahub/pull/6517
- fix(ui) Fix roles not always displaying on page load by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6524
- feat(datahub-upgrade): Added msk IAM auth as a build dependency. by @pghazanfari in https://github.com/datahub-project/datahub/pull/6439
- feat(kafka-setup): Added support for MSK IAM authentication. by @pghazanfari in https://github.com/datahub-project/datahub/pull/6435
- Added sorting method to fieldpath column of schema tab by @Ankit-Keshari-Vituity in https://github.com/datahub-project/datahub/pull/6510
- fix(ingest): make kafka emit callback optional by @hsheth2 in https://github.com/datahub-project/datahub/pull/6525
- feat(ingest): automated term classification for snowflake by @mayurinehate in https://github.com/datahub-project/datahub/pull/6376
- fix(ingest): fix typo in urn utilities by @bskim45 in https://github.com/datahub-project/datahub/pull/6520
- fix(ingest): fix trino properties and tests by @mayurinehate in https://github.com/datahub-project/datahub/pull/6518
- fix(build): remove warnings in github actions by @anshbansal in https://github.com/datahub-project/datahub/pull/6512
- fix(security): Bump ranger plugin commons dependency by @pedro93 in https://github.com/datahub-project/datahub/pull/6535
- fix(ingest): kafka - properly picking doc from union type by @treff7es in https://github.com/datahub-project/datahub/pull/6472
- feat(ingest): disable stateful_ingestion fail-safe by default by @hsheth2 in https://github.com/datahub-project/datahub/pull/6537
- fix(ingest/airflow): respect enabled flag in airflow plugin by @hsheth2 in https://github.com/datahub-project/datahub/pull/6528
- refactor(ui): Adding apollo caching to manage domains page. by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6494
- refactor(recommendations): Filtering for specific entity types in recommendations by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6538
- fix(ingest): handle groupby custom label case by @phongvu99 in https://github.com/datahub-project/datahub/pull/6456
- build(ingest): support flake8 6.0.0 by @hsheth2 in https://github.com/datahub-project/datahub/pull/6540
- fix(ui) Wrap schema field descriptions to allow read more/less always by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6541
- fix(ui) Display duplicate nodes in lineage viz by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6526
- style(ingest): fix lint checks for superset by @mayurinehate in https://github.com/datahub-project/datahub/pull/6548
- fix(envs): remove DATASET_ENABLE_SCSI stale env var by @szalai1 in https://github.com/datahub-project/datahub/pull/6546
- feat(upgrade): Make restore from backup logic generic by @pedro93 in https://github.com/datahub-project/datahub/pull/6536
- feat(ingest): refractor classification mixin, support new infotypes by @mayurinehate in https://github.com/datahub-project/datahub/pull/6545
- fix(ingest): bigquery - missing sqlalchemy dep and row count fix by @treff7es in https://github.com/datahub-project/datahub/pull/6553
- fix(ingest): bigquery - Fixing querying non-date partition columns in profiling by @treff7es in https://github.com/datahub-project/datahub/pull/6554
- feat(ingest): powerbi # scan all accessible workspaces by @looppi in https://github.com/datahub-project/datahub/pull/6441
- fix(ingest): bigquery - Setting partition id for profiling data and project_id fix by @treff7es in https://github.com/datahub-project/datahub/pull/6558
- fix(gms): fix java.lang.NoClassDefFoundError: com/sun/syndication/io/FeedException for apache-ranger authorizer by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6560
- feat(ui): Add Test Connection Support for BigQuery ingestion source by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6543
- fix(contrib): Update base python image for es7-upgrade by @david-leifker in https://github.com/datahub-project/datahub/pull/6562
- fix(ingest): handle docker-compose version `v` prefix by @hsheth2 in https://github.com/datahub-project/datahub/pull/6561
- docs(ingest/kafka): add field descriptions of kafka-related configs to pydantic by @mmmeeedddsss in https://github.com/datahub-project/datahub/pull/6559
- feat(platform): Support @Searchable + @Relationship Annotations for Timeseries Aspects by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6455
- feat(models): Adding 'created', 'lastModified' timestamp to Dataset, Container, Dashboard, Chart by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6527
- fix(ingest): set DataProcessInstance created ts to start time by @hsheth2 in https://github.com/datahub-project/datahub/pull/6566
- feat(docs-site): fast reload command for markdown edits by @hsheth2 in https://github.com/datahub-project/datahub/pull/6539
- fix(ingest): graceful error handling in snowflake classification by @mayurinehate in https://github.com/datahub-project/datahub/pull/6568
- ci(label): add smoke test label by @anshbansal in https://github.com/datahub-project/datahub/pull/6571
- fix(ingest): fix types changes in clickhouse sqlalchemy 0.2.3 by @mayurinehate in https://github.com/datahub-project/datahub/pull/6572
- fix(tests): Misc updates for tests, auth log level, and quickstart by @david-leifker in https://github.com/datahub-project/datahub/pull/6491
- feat(ui) Add owner to dataset - allow same owner with a different type by @rtekal in https://github.com/datahub-project/datahub/pull/6463
- fix(verions): Update opentelemetry and updates from pr-5239 by @david-leifker in https://github.com/datahub-project/datahub/pull/6563
- refactor(airflow): remove verbose log from airflow plugin by @bskim45 in https://github.com/datahub-project/datahub/pull/6516
- feat(cli): remove inconsistency check command by @anshbansal in https://github.com/datahub-project/datahub/pull/6569
- fix(ingest): restrict snowflake's sqlalchemy dep by @hsheth2 in https://github.com/datahub-project/datahub/pull/6579
- docs(notes): add release notes for v0.1.69 managed DataHub by @anshbansal in https://github.com/datahub-project/datahub/pull/6573
- fix(test): fix delete smoke test by @david-leifker in https://github.com/datahub-project/datahub/pull/6585
New Contributors
- @wangsaisai made their first contribution in https://github.com/datahub-project/datahub/pull/6343
- @stanbaker made their first contribution in https://github.com/datahub-project/datahub/pull/6287
- @lurecas made their first contribution in https://github.com/datahub-project/datahub/pull/6053
- @Reilman79 made their first contribution in https://github.com/datahub-project/datahub/pull/6396
- @LavinaVRovine made their first contribution in https://github.com/datahub-project/datahub/pull/6134
- @KulykDmytro made their first contribution in https://github.com/datahub-project/datahub/pull/6433
- @jakobhanna made their first contribution in https://github.com/datahub-project/datahub/pull/6477
- @lustefaniak made their first contribution in https://github.com/datahub-project/datahub/pull/6478
- @syedzoherer made their first contribution in https://github.com/datahub-project/datahub/pull/6370
- @phongvu99 made their first contribution in https://github.com/datahub-project/datahub/pull/6456
- @looppi made their first contribution in https://github.com/datahub-project/datahub/pull/6441
- @rtekal made their first contribution in https://github.com/datahub-project/datahub/pull/6463
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.9.2...v0.9.3
V0.9.2
Release Highlights
User Experience
Metadata Ingestion
New ingestion source PowerBI Report Server
DataHub Docs Site
What's Changed
- feat(change-event): add change events for DataProcessInstanceRunEvent by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/6320
- Worked on the Usage column & Lineage Drawer by @Ankit-Keshari-Vituity in https://github.com/datahub-project/datahub/pull/6290
- refactor(bootstrap data): Adding assertions data to bootstrap by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6324
- fix(ui) Disable deleting Term Groups with children by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6332
- feat(ingestion): business-glossary - Add values and relatedTerms support by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6148
- fix(ui): two small ux fixes by @gabe-lyons in https://github.com/datahub-project/datahub/pull/6335
- feat(ingest): add new ingestion source PowerBI Report Server by @alcoccoque in https://github.com/datahub-project/datahub/pull/5369
- feat(ingest): drop plugin support for airflow 1.x by @hsheth2 in https://github.com/datahub-project/datahub/pull/6331
- fix(ingest): fix invalid schema field urns with empty field path by @mayurinehate in https://github.com/datahub-project/datahu
DataHub v0.10.0
Released on 2023-02-07 by @david-leifker.
Release Highlights
Potential Downtime
This release introduces substantial improvements to search functionality which require reindexing indices.
During the reindexing:
- a system-update job will set indices to read-only and create a backup/clone of each index
- new components will be prevented from start-up until the reindex completes
- Helm deployments will go into read-only mode and new ingestion runs will fail
This process can take anywhere from 5 minutes to multiple hours; as a rough estimate, please expect it to take 1 hour for every 2.3 million entities. After the reindex is complete, please check your ingestion runs and re-run any that did not complete.
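To plan a maintenance window, the rule of thumb above can be turned into a quick calculation. The sketch below is only an illustration of that arithmetic: the function name and the 5-minute floor are assumptions drawn from the wording of this note, and actual reindex time will depend on your cluster.

```python
# Rough reindex-time estimate based on the rule of thumb in this release note.
# Assumption: duration scales linearly with entity count, with a 5-minute floor.
ENTITIES_PER_HOUR = 2_300_000  # "1 hour for every 2.3 million entities"
MIN_MINUTES = 5                # lower bound quoted in the note


def estimate_reindex_minutes(entity_count: int) -> float:
    """Return a rough reindex duration in minutes for the given entity count."""
    if entity_count < 0:
        raise ValueError("entity_count must be non-negative")
    linear_estimate = entity_count / ENTITIES_PER_HOUR * 60
    return max(MIN_MINUTES, linear_estimate)


if __name__ == "__main__":
    for count in (100_000, 2_300_000, 10_000_000):
        minutes = estimate_reindex_minutes(count)
        print(f"{count:>12,} entities -> roughly {minutes:.0f} min")
```

Treat the result as a lower bound for scheduling purposes and leave headroom, since ingestion runs will fail until the reindex completes.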
If you are deploying containers yourself
If you're deploying the Docker containers yourself (without Helm or Docker-Compose Quickstart), then you'll need to ensure that you first run the `acryldata/datahub-upgrade` docker image (v0.10.0 tag) with the required environment variables set.
Then, run the container with the command:
docker run acryldata/datahub-upgrade:v0.10.0 -u SystemUpdate
For the full set of environment variables required, check out the default docker.env provided for Docker Compose deployments.
This will run the required reindex against your Elasticsearch instance, after which other DataHub components should start correctly. If you do not run the `datahub-upgrade` container successfully, other components in the stack will fail to start correctly.
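If you script your deployments, the upgrade step can be wrapped so that later steps only run once it has succeeded. The following is a minimal sketch, not an official tool: the helper names and the `docker.env` path are hypothetical, and it assumes you pass the required variables through Docker's `--env-file` option.

```python
import subprocess
from typing import List

IMAGE = "acryldata/datahub-upgrade"
TAG = "v0.10.0"


def build_system_update_command(env_file: str, tag: str = TAG) -> List[str]:
    """Build the `docker run` invocation for the SystemUpdate job.

    env_file is a path to a file of KEY=VALUE lines (hypothetical location).
    """
    return [
        "docker", "run",
        "--env-file", env_file,
        f"{IMAGE}:{tag}",
        "-u", "SystemUpdate",
    ]


def run_system_update(env_file: str) -> None:
    """Run the upgrade container; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_system_update_command(env_file), check=True)


# Example (not executed here): run_system_update("docker.env")
```

Because `check=True` raises on failure, a deployment script can stop before starting the other components, which would fail to start anyway if the reindex did not complete.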
User Experience
We have some really exciting improvements to the DataHub user experience in this release!
Improved documentation editor, contributed by @ngamanda and the Grab Team. This work provides a much more intuitive documentation editing experience within the UI, providing “what you see is what you get” formatting & removing the need for markdown expertise.
Additionally, you can easily:
- Add links to other entities/users within DataHub
- embed and resize tables & images
- toggle between font sizes and formats
- embed syntax-highlighted code blocks
<img src="https://user-images.githubusercontent.com/114954101/217367791-3d392ae4-f422-4188-8d3c-768cb7c120ea.png" width="800">
Filter lineage graphs based on time windows. You can now easily see the full lineage graph of an entity at a specific point in time. This makes it much easier to understand how interdependencies have evolved over time and to troubleshoot data issues in the past.
Improvements in Search. As noted above, we have rolled out substantial improvements to Search functionality, making it easier than ever for end users to find the entities that matter most. This release includes:
- Stemming & Synonyms
- Search by full or partial URN
- Autocomplete improvements
- Quoted search analyzer for exact & prefix match
Metadata Ingestion
Here are some of the most notable ingestion-related improvements:
- Redshift: You can now extract lineage information from unload queries – thanks for the contrib, @mmmeeedddsss
- PowerBI: Ingestion now maps Workspaces to DataHub Containers – thanks for the contrib, @looppi
- BigQuery: You can now extract lineage metadata from the Catalog API – thanks for the contrib, @PatrickfBraz
- Glue: Ingestion now uses table name as the human-readable name – thanks for the contrib, @danielcmessias
Developer Experience
- This release introduces DataHub Lite - a new experimental lightweight implementation of DataHub. It is intended to enable local developer tooling use-cases such as simple access to metadata for scripts and other tools. DataHub Lite is compatible with the DataHub metadata format and all the ingestion connectors that DataHub supports. Check out the docs here.
Breaking Changes
[#7103](https://github.com/datahub-project/datahub/pull/7103) This should only impact users who have configured explicit non-default names for DataHub's Kafka topics. The environment variables used to configure Kafka topics in the kafka-setup docker image have been updated to be in line with other DataHub components; for more info, see our docs on Configuring Kafka in DataHub. These variables were previously suffixed with `_TOPIC`, whereas the correct suffix is now `_TOPIC_NAME`. This change should not affect any user who is using default Kafka topic names.
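If you do override topic names for kafka-setup, a small script can help spot variables that still use the old suffix. This is only a sketch of the suffix change described above: the function name and the example variable names are hypothetical, and you should confirm the exact variable names against the Kafka configuration docs before applying any rename.

```python
from typing import Dict

OLD_SUFFIX = "_TOPIC"
NEW_SUFFIX = "_TOPIC_NAME"


def rename_topic_vars(env: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of env with legacy *_TOPIC keys renamed to *_TOPIC_NAME.

    If both the old and new key are present, the new-style value wins.
    Assumption: every key ending in _TOPIC is a kafka-setup topic override.
    """
    renamed: Dict[str, str] = {}
    for key, value in env.items():
        if key.endswith(OLD_SUFFIX):
            # Legacy key: only fill in the new key if it is not already set.
            renamed.setdefault(key + "_NAME", value)
        else:
            # New-style or unrelated key: copy as-is (overrides a legacy value).
            renamed[key] = value
    return renamed


if __name__ == "__main__":
    # EXAMPLE_EVENT_TOPIC is a made-up name used purely for illustration.
    legacy = {"EXAMPLE_EVENT_TOPIC": "my_custom_topic", "UNRELATED": "1"}
    print(rename_topic_vars(legacy))
```

Reviewing the output before applying it is advisable, since an unrelated variable that happens to end in `_TOPIC` would also be renamed by this simple suffix check.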
What's Changed
- fix(ci): only scan on master branch by @anshbansal in https://github.com/datahub-project/datahub/pull/7047
- fix(ci): use trivy offline scanning by @anshbansal in https://github.com/datahub-project/datahub/pull/7050
- docs(get-started) Simplify copy on Get Started landing page by @maggiehays in https://github.com/datahub-project/datahub/pull/7043
- fix(ingest/kafka): fix ResourceType import error for confluent_kafka<1.9.0 by @mayurinehate in https://github.com/datahub-project/datahub/pull/7046
- docs(dbt): fix indentation in dbt meta mapping docs by @jx2lee in https://github.com/datahub-project/datahub/pull/7045
- fix(ingest): temporarily disable vertica tests by @hsheth2 in https://github.com/datahub-project/datahub/pull/7059
- feat(editor): improve documentation editor using Remirror by @ngamanda in https://github.com/datahub-project/datahub/pull/6631
- fix(bootstrap): add EDIT_LINEAGE privilege to some default policies by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7060
- feat(ingest): add entity registry in codegen by @hsheth2 in https://github.com/datahub-project/datahub/pull/6984
- feat(ingest): extract powerbi endorsements to tags by @looppi in https://github.com/datahub-project/datahub/pull/6638
- feat(ingestion): pull metabase database, schema names from raw query and api by @remisalmon in https://github.com/datahub-project/datahub/pull/7039
- fix(ingest): support multiple entity_registry sections by @hsheth2 in https://github.com/datahub-project/datahub/pull/7066
- ci(ingest): add flag to skip tests but run codegen during release by @hsheth2 in https://github.com/datahub-project/datahub/pull/7067
- fix(ingest): preserve dbt column name casing by @hsheth2 in https://github.com/datahub-project/datahub/pull/7063
- fix(ingest/tableau): fix node limit exceeded error for workbooks query by @mayurinehate in https://github.com/datahub-project/datahub/pull/7068
- fix(build/airflow): Fixing gradlew path by @treff7es in https://github.com/datahub-project/datahub/pull/7069
- feat(ingest): support snapshots in dbt and dbt-cloud by @hsheth2 in https://github.com/datahub-project/datahub/pull/7062
- fix(ui) Fix duplicate schema field rendering with siblings by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7057
- refactor(ingest/athena): Replace `s3_staging_dir` parameter in Athena source with `query_result_location` by @bossenti in https://github.com/datahub-project/datahub/pull/7044
- feat(ingest): fix handling of unions with aliases in post restli conversion by @hsheth2 in https://github.com/datahub-project/datahub/pull/7058
- fix(ui) Make checkboxes in ingestion forms easier to see by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7061
- fix(ingest): support git clone of non-github repos by @hsheth2 in https://github.com/datahub-project/datahub/pull/7065
- feat(ingest): reporting revamp, part 1 by @hsheth2 in https://github.com/datahub-project/datahub/pull/7031
- fix(secret-service): fix default encrypt key by @david-leifker in https://github.com/datahub-project/datahub/pull/7074
- feat(datahub-lite): introduces a new experimental lightweight impleme… by @shirshanka in https://github.com/datahub-project/datahub/pull/7052
- feat(datahub-lite): adding tab completion, small serialization fixes by @shirshanka in https://github.com/datahub-project/datahub/pull/7079
- docs: add docs for managed DataHub v0.1.72 by @anshbansal in https://github.com/datahub-project/datahub/pull/7070
- docs(readme): add inovex as adopter by @DSchmidtDev in https://github.com/datahub-project/datahub/pull/7077
- docs: add warning about clearing cookies for login by @anshbansal in https://github.com/datahub-project/datahub/pull/7084
- feat(cache): add hazelcast distributed cache option by @RyanHolstien in https://github.com/datahub-project/datahub/pull/6645
- docs(datahub-lite): small improvement for zsh tab completion by @shirshanka in https://github.com/datahub-project/datahub/pull/7085
- fix(ingest/bigquery): clear stateful ingestion correctly by @hsheth2 in https://github.com/datahub-project/datahub/pull/7075
- fix(graphql): Return with appropriate status code instead of stacktrace by @szalai1 in https://github.com/datahub-project/datahub/pull/7086
- fix(sso): Clear cookies on SSO redirect error by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7088
- fix(docs): add missing mutation literal by @ruedigerblock in https://github.com/datahub-project/datahub/pull/7082
- fix(ui): display the correct access token expiry in AccessTokenModal by @ngamanda in https://github.com/datahub-project/datahub/pull/7078
- fix(cli/lite): fix datahub lite serve command by @hsheth2 in https://github.com/datahub-project/datahub/pull/7089
- fix(profiling): Fix syntax for APPROX_COUNT_DISTINCT on bigquery and snowflake by @feljen in https://github.com/datahub-project/datahub/pull/7087
- fix(ingest): fix logic error of google protobuf wrapper type. by @wngus606 in https://github.com/datahub-project/datahub/pull/7076
- feat(ui): Documentation Editor Improvements by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7072
- fix(uri): marks uri field as deprecated, removes problem code, and adds coercer for usages of URI typeref by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7093
- fix(build): postgres docker secret by @david-leifker in https://github.com/datahub-project/datahub/pull/7092
- fix(ingest/snowflake): handle corrupted snowflake OCSP cache file by @hsheth2 in https://github.com/datahub-project/datahub/pull/7095
- refactor(ingest): Refactoring container creation to common place by @treff7es in https://github.com/datahub-project/datahub/pull/6877
- feat(ingest): move datahub-lite to optional dep and add shim when missing by @hsheth2 in https://github.com/datahub-project/datahub/pull/7097
- fix(docker): support non amd64 dockerize in setup containers by @tonycsoka in https://github.com/datahub-project/datahub/pull/7091
- test(ingest): fix kafka admin client mocking by @hsheth2 in https://github.com/datahub-project/datahub/pull/7098
- fix(build): Fix postgres setup gha by @david-leifker in https://github.com/datahub-project/datahub/pull/7104
- fix(ingest/profile): properly quoting approx_count_distinct by @treff7es in https://github.com/datahub-project/datahub/pull/7101
- style(models): Replaces non-ASCII charactes in pdl files with ASCII c… by @nmbryant in https://github.com/datahub-project/datahub/pull/7105
- feat(ingest): hide cartesian product warnings in GE profiler by @hsheth2 in https://github.com/datahub-project/datahub/pull/7096
- feat(ingest): add removing partition pattern in spark lineage by @ssilb4 in https://github.com/datahub-project/datahub/pull/6605
- feat(redshift): Fetch lineage from unload queries by @mmmeeedddsss in https://github.com/datahub-project/datahub/pull/7041
- fix(ci): do not confirm on force for deletion by @anshbansal in https://github.com/datahub-project/datahub/pull/7106
- fix(analytics): add missing usage events causing warning in logs by @anshbansal in https://github.com/datahub-project/datahub/pull/7109
- feat(quickstart): Remove kafka-setup as a hard deployment requirement by @pedro93 in https://github.com/datahub-project/datahub/pull/7073
- fix(tests): Fixing add_users smoke test by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7116
- chore(deps): bump ua-parser-js from 0.7.32 to 0.7.33 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/7122
- docs(gms): clarify behavior of soft deletion in UI by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7117
- fix(kafka-setup): Make topic name consistent with other images by @pedro93 in https://github.com/datahub-project/datahub/pull/7103
- chore(deps): bump ua-parser-js from 0.7.32 to 0.7.33 in /datahub-web-react by @dependabot in https://github.com/datahub-project/datahub/pull/7123
- feat(ingest): powerbi # add powerbi workspaces to containers by @looppi in https://github.com/datahub-project/datahub/pull/6532
- fix(diffMode): prevent misconfiguration of diff mode by @RyanHolstien in https://github.com/datahub-project/datahub/pull/7127
- fix(ui) Display glossary term name in analytics page properly by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7128
- fix(ui): only use visible and enabled tabs for selected tab and routing in entity profiles by @Masterchen09 in https://github.com/datahub-project/datahub/pull/6629
- fix(htrace): remove htrace jar by @szalai1 in https://github.com/datahub-project/datahub/pull/7126
- feat(datahub-lite): simplify get response by @shirshanka in https://github.com/datahub-project/datahub/pull/7131
- fix(doc/biquery): Updating bigquery capability doc by @treff7es in https://github.com/datahub-project/datahub/pull/7136
- fix(ci): do not fail fast for matrix runs by @anshbansal in https://github.com/datahub-project/datahub/pull/7132
- refactor(ui): refactor capitalization of platform name and sub types by @Masterchen09 in https://github.com/datahub-project/datahub/pull/7099
- refactor(cli): extract method, change wording by @anshbansal in https://github.com/datahub-project/datahub/pull/7134
- docs(lineage): Updating Lineage feature guide by @maggiehays in https://github.com/datahub-project/datahub/pull/6257
- removing WIP by @laulpogan in https://github.com/datahub-project/datahub/pull/7140
- docs(oidc): Updating + improving docs around OIDC configuration by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7141
- fix(ingest): add message proto check by @tinolyu in https://github.com/datahub-project/datahub/pull/7130
- fix(ingest): use snowflake median function in profiling by @hsheth2 in https://github.com/datahub-project/datahub/pull/6987
- feat(ui): allow removing parentNodes of Glossary Nodes and Glossary Terms by @ngamanda in https://github.com/datahub-project/datahub/pull/7135
- feat(ui) Add new embedded profile to be displayed in extension by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7113
- feat(ingest): add `--log-file` option and show CLI logs in UI report by @hsheth2 in https://github.com/datahub-project/datahub/pull/7118
- fix(misc): NPE and GraphQL case fixes by @david-leifker in https://github.com/datahub-project/datahub/pull/7149
- fix(ingest/snowflake): fix regression in approx count distinct by @hsheth2 in https://github.com/datahub-project/datahub/pull/7146
- [docs] fix typo / add missing line for docker compose / attach overwriting system action config for confluent. by @kdongho in https://github.com/datahub-project/datahub/pull/7142
- reordering sidebar and adding homepage to apis by @laulpogan in https://github.com/datahub-project/datahub/pull/7139
- fix(ingestion): powerbi # Not all arguments converted to string by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7157
- fix(ui): Sort top users by their query count in datasets stats tab by @jaykadambi in https://github.com/datahub-project/datahub/pull/7148
- refactor(ui): Updates to Manual Lineage search by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7151
- feat(ui) Build entity doesn't exist page for entity profiles by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7150
- ci(ingest): fix broken CI workflow for metadata-ingestion by @hsheth2 in https://github.com/datahub-project/datahub/pull/7161
- fix(ingest): azuread group mapping do not stop ingestion by @anshbansal in https://github.com/datahub-project/datahub/pull/7169
- fix(docs): Fixes links to docs templates by @viniciusdsmello in https://github.com/datahub-project/datahub/pull/7171
- refactor(ui ingest): Allow enabling / disabling ingestion schedule easily by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7162
- fix(ingest): switch various sources to `auto_stale_entity_removal` helper by @hsheth2 in https://github.com/datahub-project/datahub/pull/7158
- docs(townhall) Update Townhall History doc by @maggiehays in https://github.com/datahub-project/datahub/pull/7180
- test(ingest/delta-lake): fix spurious directory creation by @hsheth2 in https://github.com/datahub-project/datahub/pull/7179
- feat: add a linter for github actions workflows by @hsheth2 in https://github.com/datahub-project/datahub/pull/7178
- fix(quickstart): adding back kafka-setup by @szalai1 in https://github.com/datahub-project/datahub/pull/7181
- fix(docs) Fix broken links in ingestion docs by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7183
- fix(ingest/GX): fix snowflake urn generated from connection string by @mayurinehate in https://github.com/datahub-project/datahub/pull/7173
- feat(ingest): switch dbt to use `auto_stale_entity_removal` by @hsheth2 in https://github.com/datahub-project/datahub/pull/7160
- fix(ingest): fix issue in glue tests by @hsheth2 in https://github.com/datahub-project/datahub/pull/7185
- fix(log): logging timestamp in ISO8601 format instead of time by @anshbansal in https://github.com/datahub-project/datahub/pull/7188
- feat(ingest): bigquery - extracts lineage metadata from catalog api by @PatrickfBraz in https://github.com/datahub-project/datahub/pull/7137
- fix(ingest/tableau): show warning about token expiry for PATs by @hsheth2 in https://github.com/datahub-project/datahub/pull/7187
- fix(ingest/vertica): Fixing missing container properties by @treff7es in https://github.com/datahub-project/datahub/pull/7197
- chore(deps): bump Netty from 4.1.85.Final to 4.1.86.Final by @janhicken in https://github.com/datahub-project/datahub/pull/7191
- docs(ingestion): powerbi # Add permission for DAX and mashup expressions by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7195
- feat(elasticsearch): Elasticsearch improvements by @david-leifker in https://github.com/datahub-project/datahub/pull/6894
- fix(test): spark-lineage # build task as dependency of integrationTest by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7189
- chore(sample): add status removed aspect for sample data by @anshbansal in https://github.com/datahub-project/datahub/pull/7203
- docs(managed datahub): release notes for v0.1.73 by @anshbansal in https://github.com/datahub-project/datahub/pull/7194
- fix(bootstrapdata): update timestamp to be in the last 1 year by @szalai1 in https://github.com/datahub-project/datahub/pull/7206
- fix(ingest/bigquery): quoting for APPROX_COUNT_DISTINCT in BigQuery by @mryorik in https://github.com/datahub-project/datahub/pull/7207
- fix(versioning): Ensure that CLI version is always dot-delimited even in minor release versions by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7200
- fix(test): missing variables in test causing error in logs by @anshbansal in https://github.com/datahub-project/datahub/pull/7210
- feat(mlModel): mark downstream jobs as ml model downstreams lineage by @mayurinehate in https://github.com/datahub-project/datahub/pull/7205
- ci(): fix datahub-upgrade quickstart regression by @hsheth2 in https://github.com/datahub-project/datahub/pull/7217
- feat(ingest): Add custom properties to the ldap ingestion by @bda618 in https://github.com/datahub-project/datahub/pull/7125
- fix(ingest): upgrade feast to avoid build issues by @hsheth2 in https://github.com/datahub-project/datahub/pull/7218
- fix(ui) Increase the number of assertions that we query for in tab by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7215
- fix(ci): trivy code scanning fix by @anshbansal in https://github.com/datahub-project/datahub/pull/7232
- feat(glue): Use table name as human-readable name for Glue ingestion by @danielcmessias in https://github.com/datahub-project/datahub/pull/7213
- feat(ui): Supporting display of columns and storage count in previews by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7198
- fix(gms): Fixes delete references for single relationship aspects by @pedro93 in https://github.com/datahub-project/datahub/pull/7211
- docs(ingest/lineage): clarify name field in entity config for file based lineage by @mayurinehate in https://github.com/datahub-project/datahub/pull/7225
- fix(ui): typo 'Documenataion' by @vojtechneradatos in https://github.com/datahub-project/datahub/pull/7227
- fix(cli/delete): skip references prompt if deleting an aspect by @hsheth2 in https://github.com/datahub-project/datahub/pull/7220
- fix(ingest/tableau): implement workbook_page_size parameter by @hsheth2 in https://github.com/datahub-project/datahub/pull/7216
- fix(gms): Corrects MCP generation in async mode by @pedro93 in https://github.com/datahub-project/datahub/pull/7214
- fix(ingest): redshift # build late binding view lineage when sql written in upper case by @looppi in https://github.com/datahub-project/datahub/pull/7223
- fix(siblings) Fix editing of schema fields for siblings with unequal schemas by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7199
- fix(ingest-idp): emit empty GroupMembership when there are no groups by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7196
- feat(lineage): add time filtering for lineage edges by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7159
- chore(deps): bump http-cache-semantics from 4.1.0 to 4.1.1 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/7230
- refactor(docs): Minor language updates for kafka source doc header by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7237
- docs(website): fix feature availability dark mode styles by @jeffmerrick in https://github.com/datahub-project/datahub/pull/7233
- chore(log/docs): improve error log, docs by @anshbansal in https://github.com/datahub-project/datahub/pull/7239
- fix(dev.sh): Add context to kafka-setup build by @szalai1 in https://github.com/datahub-project/datahub/pull/7234
- feat(cli): improve docker quickstart by @hsheth2 in https://github.com/datahub-project/datahub/pull/7184
- fix(elasticsearch): fix orphan index clean up pattern, consistent top… by @david-leifker in https://github.com/datahub-project/datahub/pull/7242
- chore(deps): bump http-cache-semantics from 4.1.0 to 4.1.1 in /datahub-web-react by @dependabot in https://github.com/datahub-project/datahub/pull/7231
- Update data_platforms.json by @RainerGa in https://github.com/datahub-project/datahub/pull/7244
- fix(autocomplete): Use normal properties name instead of urn name in autocomplete by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7236
- fix(frontend logs): Silencing harmless log messages (and adding path for future) by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7254
- fix(docker): fix ability to use non-default reg by @david-leifker in https://github.com/datahub-project/datahub/pull/7250
- logging(elasticsearch): improve messaging in orphan index detection by @david-leifker in https://github.com/datahub-project/datahub/pull/7246
- chore(ci): update base image dependencies by @anshbansal in https://github.com/datahub-project/datahub/pull/7248
- docs(graphql): remove reference of non-existent gms.graphql by @mayurinehate in https://github.com/datahub-project/datahub/pull/7240
- Add graphql error and call metrics at startuptime by @szalai1 in https://github.com/datahub-project/datahub/pull/7226
- docs(ingest): update kafka connect doc, simplify starter recipe by @mayurinehate in https://github.com/datahub-project/datahub/pull/7243
- fix(cli): update message when pulling docker images by @mayurinehate in https://github.com/datahub-project/datahub/pull/7241
- fix(ingest/tableau): handle missing query in tableau views by @hsheth2 in https://github.com/datahub-project/datahub/pull/7186
- feat(ingest/s3): use latest file to infer schema metadata by @mayurinehate in https://github.com/datahub-project/datahub/pull/7202
- fix(schema-blame): check if list of ChangeTransactions is empty before processing by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7263
- fix(change-events): guard against NPE's by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7264
- fix(docker): add env variable to control mysql setup image, sort dock… by @shirshanka in https://github.com/datahub-project/datahub/pull/7266
- chore(logs): clean logs scanning location by @anshbansal in https://github.com/datahub-project/datahub/pull/7261
- fix(profile): use department name if available by @anshbansal in https://github.com/datahub-project/datahub/pull/7257
- fix(async ingest): Fix async ingest path by @pedro93 in https://github.com/datahub-project/datahub/pull/7269
- fix(compose): fix override file missing container by @david-leifker in https://github.com/datahub-project/datahub/pull/7270
- fix(ui): fix spacing on share buttons by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7272
New Contributors
- @bossenti made their first contribution in https://github.com/datahub-project/datahub/pull/7044
- @ruedigerblock made their first contribution in https://github.com/datahub-project/datahub/pull/7082
- @feljen made their first contribution in https://github.com/datahub-project/datahub/pull/7087
- @tonycsoka made their first contribution in https://github.com/datahub-project/datahub/pull/7091
- @tinolyu made their first contribution in https://github.com/datahub-project/datahub/pull/7130
- @kdongho made their first contribution in https://github.com/datahub-project/datahub/pull/7142
- @jaykadambi made their first contribution in https://github.com/datahub-project/datahub/pull/7148
- @viniciusdsmello made their first contribution in https://github.com/datahub-project/datahub/pull/7171
- @mryorik made their first contribution in https://github.com/datahub-project/datahub/pull/7207
- @danielcmessias made their first contribution in https://github.com/datahub-project/datahub/pull/7213
- @vojtechneradatos made their first contribution in https://github.com/datahub-project/datahub/pull/7227
- @RainerGa made their first contribution in https://github.com/datahub-project/datahub/pull/7244
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.9.6...v0.10.0
DataHub v0.9.6.1
Released on 2023-01-31 by @david-leifker.
Release Highlights
Please upgrade from 0.9.6 ASAP to avoid ongoing issues creating and using secrets.
Important Release Notes
With this release, if you are using Neo4J as your graph implementation, you need to set `GRAPH_SERVICE_DIFF_MODE_ENABLED=false` for GMS (or for the MAE Consumer in standalone mode).
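A pre-deployment check for this requirement might look like the sketch below. It assumes the graph backend is selected through an environment variable named `GRAPH_SERVICE_IMPL`; that name is an assumption here and should be confirmed against your deployment configuration. `diff_mode_problem` is a hypothetical helper, not part of DataHub.

```python
import os
from typing import Mapping, Optional


def diff_mode_problem(env: Mapping[str, str]) -> Optional[str]:
    """Return a warning if Neo4j is selected and graph diff mode is not disabled."""
    # GRAPH_SERVICE_IMPL is an assumed variable name for the graph backend choice.
    graph_impl = env.get("GRAPH_SERVICE_IMPL", "").lower()
    diff_mode = env.get("GRAPH_SERVICE_DIFF_MODE_ENABLED", "").lower()
    if graph_impl == "neo4j" and diff_mode != "false":
        return "Neo4j is configured: set GRAPH_SERVICE_DIFF_MODE_ENABLED=false"
    return None


if __name__ == "__main__":
    # Run this in the environment of GMS (or the MAE Consumer in standalone mode).
    print(diff_mode_problem(os.environ) or "graph diff mode setting looks fine")
```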
Bug fix for secrets encryption
- Prevents decryption errors for existing secrets
- Affects reading ingestion secrets created with a previous release
- Affects native user password validation
What's Changed
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.9.6...v0.9.6.1
DataHub v0.9.6
Released on 2023-01-13 by @maggiehays.
⚠️ This Release has been patched. Please upgrade to 0.9.6.1 ⚠️
As of January 19th, 2023, 0.9.6.1 is the official release build and should be used over 0.9.6. Upgrade to 0.9.6.1 when possible to avoid issues creating and using secrets.
Release Highlights
Important Release Notes
With this release, if you are using Neo4J as your graph implementation, you need to set `GRAPH_SERVICE_DIFF_MODE_ENABLED=false` for GMS (or for the MAE Consumer in standalone mode).
User Experience
- We now support embedding Dashboards, Charts, and Datasets. This allows us to do things like directly embed Looker / Tableau / Mode / Redash Looks, Dashboards, Explores into the Dataset pages themselves.
- [Experimental] You can now customize the number of queries displayed on the Query tab of a Dataset entity
- Improved error messaging for bulk editing via the UI
Metadata Ingestion
- Data profiling now supports a configurable number of sample values to be returned
- Postgres ingestion now supports emitting lineage edges for Views - shoutout to @LucasRoesler for the contribution!
- Snowflake ingestion now supports extracting tags - shoutout to @frsann for the contribution!
- Vertica ingestion now supports projections and lineage - thanks for the contribution, @vishalkSimplify!
- Glue ingestion now emits an s3 lineage edge when data was written with an s3a/s3n client - thanks for the contribution, @danielli-ziprecruiter!
Developer Experience
- Fixes quickstart/docker compose issues for M1 machines
- Improvements in reliability and performance of the Restli Service endpoints for ingestion:
  - Scale Restli Service thread pool based on CPU
  - Add retry (exp backoff) to Restli Entity Client
  - MCE no longer relies on GMS for Restli service
  - Converted Restli Service from standalone servlet to Spring injectable
- Docker build externalized (significantly faster on m1, <7 minute build times, based on this)
- Frontend asset generation refactor (causing tests to fail intermittently)
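For readers unfamiliar with the "retry (exp backoff)" pattern mentioned above, here is a minimal generic sketch. It is not the Restli Entity Client's implementation: `retry_with_backoff`, its parameters, and the delay values are all hypothetical and chosen only to illustrate the technique, where the wait doubles after each failed attempt.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Run call(), retrying on any exception with delays of base_delay * 2**attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")


# Usage: a call that fails twice with a transient error, then succeeds.
delays: List[float] = []
attempts = {"n": 0}


def flaky() -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


# Record the delays instead of actually sleeping so the example runs instantly.
result = retry_with_backoff(flaky, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the helper easy to test; production code would typically also cap the delay and retry only on errors known to be transient.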
What's Changed
- feat(ingest): add pydantic helper for removed fields by @hsheth2 in https://github.com/datahub-project/datahub/pull/6853
- chore(0.9.5): Bump defaults for release v0.9.5 by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6856
- Revert "fix(ci): remove warnings due to deprecated action" by @anshbansal in https://github.com/datahub-project/datahub/pull/6857
- refactor(restli-mce-consumer) by @david-leifker in https://github.com/datahub-project/datahub/pull/6744
- fix(ci): reduce smoke test run time by @anshbansal in https://github.com/datahub-project/datahub/pull/6841
- fix(security): require signed/encrypted jwt tokens by @david-leifker in https://github.com/datahub-project/datahub/pull/6565
- feat(ingest): update profiling to fetch configurable number of sample values by @mayurinehate in https://github.com/datahub-project/datahub/pull/6859
- feat(ingest/airflow): support raw dataset urns in airflow lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/6854
- refactor(graphql): make graphqlengine easier to use by @anshbansal in https://github.com/datahub-project/datahub/pull/6865
- fix(kafka): datahub-upgrade job by @david-leifker in https://github.com/datahub-project/datahub/pull/6864
- feat(ingest): pass timeout config in kafka admin client api calls by @mayurinehate in https://github.com/datahub-project/datahub/pull/6863
- chore(ingest): loosen requirements file by @hsheth2 in https://github.com/datahub-project/datahub/pull/6867
- feat(ingest): upgrade pydantic version by @cccs-eric in https://github.com/datahub-project/datahub/pull/6858
- fix(elasticsearch): fixes out of order runId writes by @david-leifker in https://github.com/datahub-project/datahub/pull/6845
- chore(ingest): loosen additional requirements by @hsheth2 in https://github.com/datahub-project/datahub/pull/6868
- feat(ingest): bigquery/snowflake - Store last profile date in state by @treff7es in https://github.com/datahub-project/datahub/pull/6832
- docs(google-analytics): Correct grammatical error in README.md by @jx2lee in https://github.com/datahub-project/datahub/pull/6870
- feat(CI): add venv caching by @szalai1 in https://github.com/datahub-project/datahub/pull/6843
- feat(ingest/snowflake): handle failures gracefully and raise permission failures by @mayurinehate in https://github.com/datahub-project/datahub/pull/6748
- fix(runid): always update runid, except when queued by @david-leifker in https://github.com/datahub-project/datahub/pull/6876
- fix(ingest): conditionally include env in assertion guid by @hsheth2 in https://github.com/datahub-project/datahub/pull/6811
- chore(ci): update dependencies docs-website by @anshbansal in https://github.com/datahub-project/datahub/pull/6871
- feat(ui) - Add a custom error message for bulk edit to add clarity by @mkamalas in https://github.com/datahub-project/datahub/pull/6775
- docs(adding users): Refreshing the docs for adding new DataHub Users by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6879
- test(mce-consumer): mockbeans by @david-leifker in https://github.com/datahub-project/datahub/pull/6878
- feat(ingest): avoid embedding serialized json in metadata files by @hsheth2 in https://github.com/datahub-project/datahub/pull/6742
- refactor(gradle): move the local docker registry to common location by @david-leifker in https://github.com/datahub-project/datahub/pull/6881
- refactor(smoke): use env variables by @anshbansal in https://github.com/datahub-project/datahub/pull/6866
- fix(lint): pin pydantic version by @anshbansal in https://github.com/datahub-project/datahub/pull/6886
- refactor(docs): Correctly spell elasticsearch in docs by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6880
- fix(ingest): okta undefined variable error by @anshbansal in https://github.com/datahub-project/datahub/pull/6882
- fix(ci): reduce flakiness in add_users, siblings smoke test by @anshbansal in https://github.com/datahub-project/datahub/pull/6883
- fix(ingest): fall back to default table comment method for all Trino query errors by @marvin-roesch in https://github.com/datahub-project/datahub/pull/6873
- test(misc): misc test updates by @david-leifker in https://github.com/datahub-project/datahub/pull/6890
- deprecate(ingest): bigquery - Removing bigquery-legacy source by @treff7es in https://github.com/datahub-project/datahub/pull/6851
- chore(ingest): remove inferred args to MCPW, part 1 by @hsheth2 in https://github.com/datahub-project/datahub/pull/6819
- test(ingest/kafka-connect): make docker setup more reliable by @hsheth2 in https://github.com/datahub-project/datahub/pull/6902
- fix(ingest): profiling (bigquery) - Address biquery profiling query error due to timestamp vs data mismatch by @treff7es in https://github.com/datahub-project/datahub/pull/6874
- fix(cli): Make datahub quickstart work with latest docker compose in M1 by @pedro93 in https://github.com/datahub-project/datahub/pull/6891
- fix(cli): fix delete urn cli bug + stricter type annotations by @hsheth2 in https://github.com/datahub-project/datahub/pull/6903
- fix(ingest/airflow): reorder imports to avoid cyclical dependencies by @stijndehaes in https://github.com/datahub-project/datahub/pull/6719
- feat: remove jq requirement + tweak modeldocgen args by @hsheth2 in https://github.com/datahub-project/datahub/pull/6904
- chore(ingest): loosen pyspark and pydeequ deps by @hsheth2 in https://github.com/datahub-project/datahub/pull/6908
- docs(ingest/looker): fix typos + update lookml github action example by @hsheth2 in https://github.com/datahub-project/datahub/pull/6910
- fix(ingest/metabase): use card_id in dashboard to chart lineage by @ccpypy in https://github.com/datahub-project/datahub/pull/6583
- fix(es-setup): create data stream on non-aws by @szalai1 in https://github.com/datahub-project/datahub/pull/6926
- Adding missing Platform logos by @maggiehays in https://github.com/datahub-project/datahub/pull/6892
- feat(ingestion): PowerBI# Improve PowerBI source ingestion by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6549
- Fix compose context for kafka-setup by @szalai1 in https://github.com/datahub-project/datahub/pull/6923
- feat(backend): Supporting Embeddable Previews for Dashboards, Charts, Datasets by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6875
- chore(deps): bump json5 from 2.2.1 to 2.2.3 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/6930
- chore(deps): bump json5 from 1.0.1 to 1.0.2 in /datahub-web-react by @dependabot in https://github.com/datahub-project/datahub/pull/6931
- fix(ci): managed ingestion test fix by @anshbansal in https://github.com/datahub-project/datahub/pull/6946
- feat(ingest): add `include_table_location_lineage` flag for SQL common by @hsheth2 in https://github.com/datahub-project/datahub/pull/6934
- feat(ingest): allow extracting snowflake tags by @frsann in https://github.com/datahub-project/datahub/pull/6500
- chore(ingest): unpin pydantic dep by @hsheth2 in https://github.com/datahub-project/datahub/pull/6909
- chore(ingest): partially revert pyspark dep from #6908 by @hsheth2 in https://github.com/datahub-project/datahub/pull/6954
- fix(ingest): use branch info when cloning git repos by @hsheth2 in https://github.com/datahub-project/datahub/pull/6937
- chore(ingest): remove inferred args to MCPW, part 2 by @hsheth2 in https://github.com/datahub-project/datahub/pull/6905
- fix(ingest/unity): simplify MCP generation and reporting by @hsheth2 in https://github.com/datahub-project/datahub/pull/6911
- chore(ci): parallelise build and test workflow to reduce time by @anshbansal in https://github.com/datahub-project/datahub/pull/6949
- fix(frontend): sasl.client.callback.handler.class by @szalai1 in https://github.com/datahub-project/datahub/pull/6962
- chore(react): remove outdated cypress tests and dependency by @anshbansal in https://github.com/datahub-project/datahub/pull/6948
- fix(ci): restrict GE to fix build issues by @anshbansal in https://github.com/datahub-project/datahub/pull/6967
- feat(queries): [Experimental] Allow customization of # of queries in Query tab via env var by @gabe-lyons in https://github.com/datahub-project/datahub/pull/6964
- feat(ingest/postgres): emit lineage for postgres views by @LucasRoesler in https://github.com/datahub-project/datahub/pull/6953
- feat(ingest/vertica): support projections and lineage in vertica by @vishalkSimplify in https://github.com/datahub-project/datahub/pull/6785
- fix(ingest): add missing dep for powerbi by @hsheth2 in https://github.com/datahub-project/datahub/pull/6969
- Docs fixes week of 12 22 by @laulpogan in https://github.com/datahub-project/datahub/pull/6963
- fix(ingest): unfreeze bigquery/snowflake column dataclass by @mayurinehate in https://github.com/datahub-project/datahub/pull/6921
- chore(frontend) Remove unused dependencies from package.json by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6974
- chore: misc fixes by @anshbansal in https://github.com/datahub-project/datahub/pull/6966
- feat(ingest/glue): emit s3 lineage for s3a and s3n schemes by @danielli-ziprecruiter in https://github.com/datahub-project/datahub/pull/6788
- fix(kafka-setup): Make kafka-setup run with multiple threads by @pedro93 in https://github.com/datahub-project/datahub/pull/6970
- feat(ingest): mark database_alias and env as deprecated by @hsheth2 in https://github.com/datahub-project/datahub/pull/6901
- fix(docs): Updating Tag, Glossary Term docs to point to correct GraphQL methods by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6965
- chore(deps): bump certifi from 2020.12.5 to 2022.12.7 in /metadata-ingestion/src/datahub/ingestion/source/feast_image by @dependabot in https://github.com/datahub-project/datahub/pull/6979
- fix(ingest): profiling - Fixing issue with the wrong timestamp stored in check by @treff7es in https://github.com/datahub-project/datahub/pull/6978
- config(quickstart): enable auto-reindex for quickstart by @david-leifker in https://github.com/datahub-project/datahub/pull/6983
- feat(privileges) - Create a privilege to manage glossary children recursively by @mkamalas in https://github.com/datahub-project/datahub/pull/6731
- chore(ingest): finish removing feast-legacy by @hsheth2 in https://github.com/datahub-project/datahub/pull/6985
- feat(ingest): add import descriptions of two or more nested messages by @wngus606 in https://github.com/datahub-project/datahub/pull/6959
- feat(docs) Add feature guide for Manual Lineage by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6933
- docs(rfc): Serialising GMS Updates with Preconditions by @mattmatravers in https://github.com/datahub-project/datahub/pull/5818
- fix(ingest/kafka-connect) support newer version of debezium by @jaegwonseo in https://github.com/datahub-project/datahub/pull/6943
- fix(docs): build and broken snowflake docs fix by @anshbansal in https://github.com/datahub-project/datahub/pull/6997
- fix(ingest): bigquery - views in case more than 1 datasets with views by @anshbansal in https://github.com/datahub-project/datahub/pull/6995
- fix(docs): Renaming Business Glossary Doc by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7001
- fix(ingest/snowflake): fix type annotations + refactor get_connect_args by @hsheth2 in https://github.com/datahub-project/datahub/pull/7004
- fix(docs): Changing the platform event topic name in kafka custom topic docs by @blankon123 in https://github.com/datahub-project/datahub/pull/7007
- fix(docs): fix name of privilege referenced in posts doc by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7002
- fix(SSO): Correctly redirect to originally requested URL in SSO by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7011
- fix(ingest): remove dead code from tests by @hsheth2 in https://github.com/datahub-project/datahub/pull/7005
- feat(ingestion): Tableau # Embed links by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6994
- feat(auth) Update auth cookies to have same-site none for chrome extension by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6976
- docs(website): DPG WIP by @maggiehays in https://github.com/datahub-project/datahub/pull/6998
- docs: resize datahub logo by @hsheth2 in https://github.com/datahub-project/datahub/pull/7014
- fix(kafka-setup): Remove reference to non-existing topic by @pedro93 in https://github.com/datahub-project/datahub/pull/7019
- fix(ingest): powerbi # use display name field as title for powerbi report page by @looppi in https://github.com/datahub-project/datahub/pull/7017
- feat(auth) Allow session ttl to be configurable by env variable by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7022
- fix(ui): URL Encode all Entity Profile URLs by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7023
- fix(ui ingest): Fix test connection when stateful ingest is enabled by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7013
- docs(sso) move root user warning to earlier in SSO guides by @maggiehays in https://github.com/datahub-project/datahub/pull/7028
- fix(ingest/looker): add clarity in chart input parsing logs by @hsheth2 in https://github.com/datahub-project/datahub/pull/7003
- chore(ingest): remove duplicate data_platform.json file by @hsheth2 in https://github.com/datahub-project/datahub/pull/7026
- feat(ingestion): PowerBI # Remove corpUserInfo aspect ingestion by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7034
- fix(metadata-models): remove unnecessary bin folder by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7035
- fixing typos by @maggiehays in https://github.com/datahub-project/datahub/pull/7030
New Contributors
- @marvin-roesch made their first contribution in https://github.com/datahub-project/datahub/pull/6873
- @stijndehaes made their first contribution in https://github.com/datahub-project/datahub/pull/6719
- @ccpypy made their first contribution in https://github.com/datahub-project/datahub/pull/6583
- @LucasRoesler made their first contribution in https://github.com/datahub-project/datahub/pull/6953
- @vishalkSimplify made their first contribution in https://github.com/datahub-project/datahub/pull/6785
- @wngus606 made their first contribution in https://github.com/datahub-project/datahub/pull/6959
- @jaegwonseo made their first contribution in https://github.com/datahub-project/datahub/pull/6943
- @blankon123 made their first contribution in https://github.com/datahub-project/datahub/pull/7007
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.9.5...v0.9.6
What's Changed
- feat(ingest): add pydantic helper for removed fields by @hsheth2 in https://github.com/datahub-project/datahub/pull/6853
- chore(0.9.5): Bump defaults for release v0.9.5 by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6856
- Revert "fix(ci): remove warnings due to deprecated action" by @anshbansal in https://github.com/datahub-project/datahub/pull/6857
- refactor(restli-mce-consumer) by @david-leifker in https://github.com/datahub-project/datahub/pull/6744
- fix(ci): reduce smoke test run time by @anshbansal in https://github.com/datahub-project/datahub/pull/6841
- fix(security): require signed/encrypted jwt tokens by @david-leifker in https://github.com/datahub-project/datahub/pull/6565
- feat(ingest): update profiling to fetch configurable number of sample values by @mayurinehate in https://github.com/datahub-project/datahub/pull/6859
- feat(ingest/airflow): support raw dataset urns in airflow lineage by @hsheth2 in https://github.com/datahub-project/datahub/pull/6854
- refactor(graphql): make graphqlengine easier to use by @anshbansal in https://github.com/datahub-project/datahub/pull/6865
- fix(kafka): datahub-upgrade job by @david-leifker in https://github.com/datahub-project/datahub/pull/6864
- feat(ingest): pass timeout config in kafka admin client api calls by @mayurinehate in https://github.com/datahub-project/datahub/pull/6863
- chore(ingest): loosen requirements file by @hsheth2 in https://github.com/datahub-project/datahub/pull/6867
- feat(ingest): upgrade pydantic version by @cccs-eric in https://github.com/datahub-project/datahub/pull/6858
- fix(elasticsearch): fixes out of order runId writes by @david-leifker in https://github.com/datahub-project/datahub/pull/6845
- chore(ingest): loosen additional requirements by @hsheth2 in https://github.com/datahub-project/datahub/pull/6868
- feat(ingest): bigquery/snowflake - Store last profile date in state by @treff7es in https://github.com/datahub-project/datahub/pull/6832
- docs(google-analytics): Correct grammatical error in README.md by @jx2lee in https://github.com/datahub-project/datahub/pull/6870
- feat(CI): add venv caching by @szalai1 in https://github.com/datahub-project/datahub/pull/6843
- feat(ingest/snowflake): handle failures gracefully and raise permission failures by @mayurinehate in https://github.com/datahub-project/datahub/pull/6748
- fix(runid): always update runid, except when queued by @david-leifker in https://github.com/datahub-project/datahub/pull/6876
- fix(ingest): conditionally include env in assertion guid by @hsheth2 in https://github.com/datahub-project/datahub/pull/6811
- chore(ci): update dependencies docs-website by @anshbansal in https://github.com/datahub-project/datahub/pull/6871
- feat(ui) - Add a custom error message for bulk edit to add clarity by @mkamalas in https://github.com/datahub-project/datahub/pull/6775
- docs(adding users): Refreshing the docs for adding new DataHub Users by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6879
- test(mce-consumer): mockbeans by @david-leifker in https://github.com/datahub-project/datahub/pull/6878
- feat(ingest): avoid embedding serialized json in metadata files by @hsheth2 in https://github.com/datahub-project/datahub/pull/6742
- refactor(gradle): move the local docker registry to common location by @david-leifker in https://github.com/datahub-project/datahub/pull/6881
- refactor(smoke): use env variables by @anshbansal in https://github.com/datahub-project/datahub/pull/6866
- fix(lint): pin pydantic version by @anshbansal in https://github.com/datahub-project/datahub/pull/6886
- refactor(docs): Correctly spell elasticsearch in docs by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6880
- fix(ingest): okta undefined variable error by @anshbansal in https://github.com/datahub-project/datahub/pull/6882
- fix(ci): reduce flakiness in add_users, siblings smoke test by @anshbansal in https://github.com/datahub-project/datahub/pull/6883
- fix(ingest): fall back to default table comment method for all Trino query errors by @marvin-roesch in https://github.com/datahub-project/datahub/pull/6873
- test(misc): misc test updates by @david-leifker in https://github.com/datahub-project/datahub/pull/6890
- deprecate(ingest): bigquery - Removing bigquery-legacy source by @treff7es in https://github.com/datahub-project/datahub/pull/6851
- chore(ingest): remove inferred args to MCPW, part 1 by @hsheth2 in https://github.com/datahub-project/datahub/pull/6819
- test(ingest/kafka-connect): make docker setup more reliable by @hsheth2 in https://github.com/datahub-project/datahub/pull/6902
- fix(ingest): profiling (bigquery) - Address bigquery profiling query error due to timestamp vs data mismatch by @treff7es in https://github.com/datahub-project/datahub/pull/6874
- fix(cli): Make datahub quickstart work with latest docker compose in M1 by @pedro93 in https://github.com/datahub-project/datahub/pull/6891
- fix(cli): fix delete urn cli bug + stricter type annotations by @hsheth2 in https://github.com/datahub-project/datahub/pull/6903
- fix(ingest/airflow): reorder imports to avoid cyclical dependencies by @stijndehaes in https://github.com/datahub-project/datahub/pull/6719
- feat: remove jq requirement + tweak modeldocgen args by @hsheth2 in https://github.com/datahub-project/datahub/pull/6904
- chore(ingest): loosen pyspark and pydeequ deps by @hsheth2 in https://github.com/datahub-project/datahub/pull/6908
- docs(ingest/looker): fix typos + update lookml github action example by @hsheth2 in https://github.com/datahub-project/datahub/pull/6910
- fix(ingest/metabase): use card_id in dashboard to chart lineage by @ccpypy in https://github.com/datahub-project/datahub/pull/6583
- fix(es-setup): create data stream on non-aws by @szalai1 in https://github.com/datahub-project/datahub/pull/6926
- Adding missing Platform logos by @maggiehays in https://github.com/datahub-project/datahub/pull/6892
- feat(ingestion): PowerBI# Improve PowerBI source ingestion by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6549
- Fix compose context for kafka-setup by @szalai1 in https://github.com/datahub-project/datahub/pull/6923
- feat(backend): Supporting Embeddable Previews for Dashboards, Charts, Datasets by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6875
- chore(deps): bump json5 from 2.2.1 to 2.2.3 in /docs-website by @dependabot in https://github.com/datahub-project/datahub/pull/6930
- chore(deps): bump json5 from 1.0.1 to 1.0.2 in /datahub-web-react by @dependabot in https://github.com/datahub-project/datahub/pull/6931
- fix(ci): managed ingestion test fix by @anshbansal in https://github.com/datahub-project/datahub/pull/6946
- feat(ingest): add include_table_location_lineage flag for SQL common by @hsheth2 in https://github.com/datahub-project/datahub/pull/6934
- feat(ingest): allow extracting snowflake tags by @frsann in https://github.com/datahub-project/datahub/pull/6500
- chore(ingest): unpin pydantic dep by @hsheth2 in https://github.com/datahub-project/datahub/pull/6909
- chore(ingest): partially revert pyspark dep from #6908 by @hsheth2 in https://github.com/datahub-project/datahub/pull/6954
- fix(ingest): use branch info when cloning git repos by @hsheth2 in https://github.com/datahub-project/datahub/pull/6937
- chore(ingest): remove inferred args to MCPW, part 2 by @hsheth2 in https://github.com/datahub-project/datahub/pull/6905
- fix(ingest/unity): simplify MCP generation and reporting by @hsheth2 in https://github.com/datahub-project/datahub/pull/6911
- chore(ci): parallelise build and test workflow to reduce time by @anshbansal in https://github.com/datahub-project/datahub/pull/6949
- fix(frontend): sasl.client.callback.handler.class by @szalai1 in https://github.com/datahub-project/datahub/pull/6962
- chore(react): remove outdated cypress tests and dependency by @anshbansal in https://github.com/datahub-project/datahub/pull/6948
- fix(ci): restrict GE to fix build issues by @anshbansal in https://github.com/datahub-project/datahub/pull/6967
- feat(queries): [Experimental] Allow customization of # of queries in Query tab via env var by @gabe-lyons in https://github.com/datahub-project/datahub/pull/6964
- feat(ingest/postgres): emit lineage for postgres views by @LucasRoesler in https://github.com/datahub-project/datahub/pull/6953
- feat(ingest/vertica): support projections and lineage in vertica by @vishalkSimplify in https://github.com/datahub-project/datahub/pull/6785
- fix(ingest): add missing dep for powerbi by @hsheth2 in https://github.com/datahub-project/datahub/pull/6969
- Docs fixes week of 12 22 by @laulpogan in https://github.com/datahub-project/datahub/pull/6963
- fix(ingest): unfreeze bigquery/snowflake column dataclass by @mayurinehate in https://github.com/datahub-project/datahub/pull/6921
- chore(frontend) Remove unused dependencies from package.json by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6974
- chore: misc fixes by @anshbansal in https://github.com/datahub-project/datahub/pull/6966
- feat(ingest/glue): emit s3 lineage for s3a and s3n schemes by @danielli-ziprecruiter in https://github.com/datahub-project/datahub/pull/6788
- fix(kafka-setup): Make kafka-setup run with multiple threads by @pedro93 in https://github.com/datahub-project/datahub/pull/6970
- feat(ingest): mark database_alias and env as deprecated by @hsheth2 in https://github.com/datahub-project/datahub/pull/6901
- fix(docs): Updating Tag, Glossary Term docs to point to correct GraphQL methods by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6965
- chore(deps): bump certifi from 2020.12.5 to 2022.12.7 in /metadata-ingestion/src/datahub/ingestion/source/feast_image by @dependabot in https://github.com/datahub-project/datahub/pull/6979
- fix(ingest): profiling - Fixing issue with the wrong timestamp stored in check by @treff7es in https://github.com/datahub-project/datahub/pull/6978
- config(quickstart): enable auto-reindex for quickstart by @david-leifker in https://github.com/datahub-project/datahub/pull/6983
- feat(privileges) - Create a privilege to manage glossary children recursively by @mkamalas in https://github.com/datahub-project/datahub/pull/6731
- chore(ingest): finish removing feast-legacy by @hsheth2 in https://github.com/datahub-project/datahub/pull/6985
- feat(ingest): add import descriptions of two or more nested messages by @wngus606 in https://github.com/datahub-project/datahub/pull/6959
- feat(docs) Add feature guide for Manual Lineage by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6933
- docs(rfc): Serialising GMS Updates with Preconditions by @mattmatravers in https://github.com/datahub-project/datahub/pull/5818
- fix(ingest/kafka-connect) support newer version of debezium by @jaegwonseo in https://github.com/datahub-project/datahub/pull/6943
- fix(docs): build and broken snowflake docs fix by @anshbansal in https://github.com/datahub-project/datahub/pull/6997
- fix(ingest): bigquery - views in case more than 1 datasets with views by @anshbansal in https://github.com/datahub-project/datahub/pull/6995
- fix(docs): Renaming Business Glossary Doc by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7001
- fix(ingest/snowflake): fix type annotations + refactor get_connect_args by @hsheth2 in https://github.com/datahub-project/datahub/pull/7004
- fix(docs): Changing the platform event topic name in kafka custom topic docs by @blankon123 in https://github.com/datahub-project/datahub/pull/7007
- fix(docs): fix name of privilege referenced in posts doc by @aditya-radhakrishnan in https://github.com/datahub-project/datahub/pull/7002
- fix(SSO): Correctly redirect to originally requested URL in SSO by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7011
- fix(ingest): remove dead code from tests by @hsheth2 in https://github.com/datahub-project/datahub/pull/7005
- feat(ingestion): Tableau # Embed links by @mohdsiddique in https://github.com/datahub-project/datahub/pull/6994
- feat(auth) Update auth cookies to have same-site none for chrome extension by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/6976
- docs(website): DPG WIP by @maggiehays in https://github.com/datahub-project/datahub/pull/6998
- docs: resize datahub logo by @hsheth2 in https://github.com/datahub-project/datahub/pull/7014
- fix(kafka-setup): Remove reference to non-existing topic by @pedro93 in https://github.com/datahub-project/datahub/pull/7019
- fix(ingest): powerbi # use display name field as title for powerbi report page by @looppi in https://github.com/datahub-project/datahub/pull/7017
- feat(auth) Allow session ttl to be configurable by env variable by @chriscollins3456 in https://github.com/datahub-project/datahub/pull/7022
- fix(ui): URL Encode all Entity Profile URLs by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7023
- fix(ui ingest): Fix test connection when stateful ingest is enabled by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7013
- docs(sso) move root user warning to earlier in SSO guides by @maggiehays in https://github.com/datahub-project/datahub/pull/7028
- fix(ingest/looker): add clarity in chart input parsing logs by @hsheth2 in https://github.com/datahub-project/datahub/pull/7003
- chore(ingest): remove duplicate data_platform.json file by @hsheth2 in https://github.com/datahub-project/datahub/pull/7026
- feat(ingestion): PowerBI # Remove corpUserInfo aspect ingestion by @mohdsiddique in https://github.com/datahub-project/datahub/pull/7034
- fix(metadata-models): remove unnecessary bin folder by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/7035
- fixing typos by @maggiehays in https://github.com/datahub-project/datahub/pull/7030
- feat(ingest): Ingest Previews for Looker Charts, Dashboards, and Explores by @jjoyce0510 in https://github.com/datahub-project/datahub/pull/6941
- fix(graphql):fix issue: autorender aspect could not be displayed on t… by @yangjiandan in https://github.com/datahub-project/datahub/pull/6993
- fix(config): adding quotes by @david-leifker in https://github.com/datahub-project/datahub/pull/7038
- fix(config): adding quotes by @david-leifker in https://github.com/datahub-project/datahub/pull/7040
- fix(ingest/bigquery): Turning some usage warning message to debug log as it caused confusion by @treff7es in https://github.com/datahub-project/datahub/pull/7024
- feat(ingest/vertica): Adding Vertica as source in Datahub UI by @Rajasekhar-Vuppala in https://github.com/datahub-project/datahub/pull/7010
- Removed a double set for two fields by @bda618 in https://github.com/datahub-project/datahub/pull/7037
New Contributors
- @marvin-roesch made their first contribution in https://github.com/datahub-project/datahub/pull/6873
- @stijndehaes made their first contribution in https://github.com/datahub-project/datahub/pull/6719
- @ccpypy made their first contribution in https://github.com/datahub-project/datahub/pull/6583
- @LucasRoesler made their first contribution in https://github.com/datahub-project/datahub/pull/6953
- @vishalkSimplify made their first contribution in https://github.com/datahub-project/datahub/pull/6785
- @wngus606 made their first contribution in https://github.com/datahub-project/datahub/pull/6959
- @jaegwonseo made their first contribution in https://github.com/datahub-project/datahub/pull/6943
- @blankon123 made their first contribution in https://github.com/datahub-project/datahub/pull/7007
- @yangjiandan made their first contribution in https://github.com/datahub-project/datahub/pull/6993
- @Rajasekhar-Vuppala made their first contribution in https://github.com/datahub-project/datahub/pull/7010
Full Changelog: https://github.com/datahub-project/datahub/compare/v0.9.5...v0.9.6
DataHub v0.9.5
Released on 2022-12-23 by @jjoyce0510.
View the release notes for DataHub v0.9.5 on GitHub.
[Known Issues] DataHub v0.9.4
Released on 2022-12-20 by @maggiehays.
View the release notes for [Known Issues] DataHub v0.9.4 on GitHub.
DataHub v0.9.3
Released on 2022-11-30 by @maggiehays.
View the release notes for DataHub v0.9.3 on GitHub.
DataHub v0.9.2
Released on 2022-11-04 by @maggiehays.
View the release notes for DataHub v0.9.2 on GitHub.
DataHub v0.9.1
Released on 2022-10-31 by @maggiehays.
View the release notes for DataHub v0.9.1 on GitHub.
DataHub v0.9.0
Released on 2022-10-11 by @szalai1.
View the release notes for DataHub v0.9.0 on GitHub.
DataHub v0.8.45
Released on 2022-09-23 by @gabe-lyons.
View the release notes for DataHub v0.8.45 on GitHub.
DataHub v0.8.44
Released on 2022-09-01 by @jjoyce0510.
View the release notes for DataHub v0.8.44 on GitHub.
DataHub v0.8.43
Released on 2022-08-09 by @maggiehays.
View the release notes for DataHub v0.8.43 on GitHub.
v0.8.42
Released on 2022-08-03 by @gabe-lyons.
View the release notes for v0.8.42 on GitHub.
v0.8.41
Released on 2022-07-15 by @anshbansal.
View the release notes for v0.8.41 on GitHub.
v0.8.40
Released on 2022-06-30 by @gabe-lyons.
View the release notes for v0.8.40 on GitHub.
v0.8.39
Released on 2022-06-24 by @maggiehays.
View the release notes for v0.8.39 on GitHub.
[!] DataHub v0.8.38
Released on 2022-06-09 by @jjoyce0510.
View the release notes for [!] DataHub v0.8.38 on GitHub.
[!] DataHub v0.8.37
Released on 2022-06-09 by @jjoyce0510.
View the release notes for [!] DataHub v0.8.37 on GitHub.
DataHub V0.8.36
Released on 2022-06-02 by @treff7es.
View the release notes for DataHub V0.8.36 on GitHub.
[!] DataHub v0.8.35
Released on 2022-05-18 by @dexter-mh-lee.
View the release notes for [!] DataHub v0.8.35 on GitHub.
v0.8.34
Released on 2022-05-04 by @maggiehays.
View the release notes for v0.8.34 on GitHub.
DataHub v0.8.33
Released on 2022-04-15 by @dexter-mh-lee.
View the release notes for DataHub v0.8.33 on GitHub.
DataHub v0.8.32
Released on 2022-04-04 by @dexter-mh-lee.
View the release notes for DataHub v0.8.32 on GitHub.
DataHub v0.8.31
Released on 2022-03-17 by @dexter-mh-lee.
View the release notes for DataHub v0.8.31 on GitHub.
Datahub v0.8.30
Released on 2022-03-17 by @rslanka.
View the release notes for Datahub v0.8.30 on GitHub.
DataHub v0.8.29
Released on 2022-03-10 by @shirshanka.
View the release notes for DataHub v0.8.29 on GitHub.
DataHub v0.8.28
Released on 2022-03-07 by @shirshanka.
View the release notes for DataHub v0.8.28 on GitHub.
DataHub Release Candidate v0.8.28 (rc1)
Released on 2022-03-05 by @shirshanka.
View the release notes for DataHub Release Candidate v0.8.28 (rc1) on GitHub.
Release Candidate v0.8.28
Released on 2022-03-05 by @shirshanka.
View the release notes for Release Candidate v0.8.28 on GitHub.