
Cadence Service Releases

Cadence follows semantic versioning. You can find more information about changes between versions in the Cadence changelog.

Latest release for General Availability

v1.2.16 published on 2025-02-19

Major Releases

v1.2.x

v1.0.x

v0.25.x

v0.24.x

v0.23.x

v0.22.x

v0.21.x

v0.20.x

v0.19.x


Release v1.2.16 published by neil-xie on 2/19/2025

What's Changed

Full Changelog: https://github.com/cadence-workflow/cadence/compare/v1.2.15...v1.2.16


Release v1.2.15 published by jakobht on 1/22/2025

What's Changed

Added

Changed

Fixed

Removed

New Contributors

Full Changelog: https://github.com/cadence-workflow/cadence/compare/v1.2.14...v1.2.15


Release v1.2.14 published by neil-xie on 11/13/2024

What's Changed

Added

Changed

Fixed

Removed

New Contributors

Full Changelog: https://github.com/uber/cadence/compare/v1.2.13...v1.2.14


Release v1.2.13 published by jakobht on 9/25/2024

What's Changed

Full Changelog: https://github.com/uber/cadence/compare/v1.2.12...v1.2.13


Release v1.2.12 published by neil-xie on 8/19/2024

What's Changed

Full Changelog: https://github.com/uber/cadence/compare/v1.2.11...v1.2.12


Release v1.2.11 published by neil-xie on 7/10/2024

What's Changed

New Contributors

Full Changelog: https://github.com/uber/cadence/compare/v1.2.10...v1.2.11


Release v1.2.10 published by neil-xie on 6/4/2024

What's Changed

New Contributors

Full Changelog: https://github.com/uber/cadence/compare/v1.2.9...v1.2.10


Release v1.2.9 published by jakobht on 5/1/2024

What's Changed

New Contributors

Full Changelog: https://github.com/uber/cadence/compare/v1.2.8...v1.2.9


Release v1.2.8 published by neil-xie on 3/26/2024

What's Changed

Added

Changed

Fixed

  • Set proper max reset points by @neil-xie in #5623
  • Put a timeout for timer task deletion loop during shutdown by @taylanisikdemir in #5626
  • Catch unit test failures in make test by @Groxx in #5635
  • fix: get messages between query over message_id typo by @zedongh in #5607
  • Fix context leak in tests by @munahaf in #5377
  • Make sure task processing rate limiter is only done in the active side by @sankari165 in https://github.com/uber/cadence/pull/5654
  • Fix Pinot query validator bug when user pass in not equal query with value missing by @neil-xie in #5662
  • Update Pinto query validator failed log, minor refactor pinot visibility store to remove panics by @neil-xie in https://github.com/uber/cadence/pull/5664
  • Fix context leak in pinot integration test by @neil-xie in #5682
  • Fix SignalWithStartWorkflow API by @Shaddoll in #5671
  • Fix wrong migration paths in example by @kotcrab in #5668
  • Fix comment in workflow id cache config by @sankari165 in #5661
  • Fix the local integration test docker-compose file by @jakobht in https://github.com/uber/cadence/pull/5695
  • Do not get workflow execution from database when shard is closed by @Shaddoll in https://github.com/uber/cadence/pull/5697

Removed

New Contributors

Full Changelog: https://github.com/uber/cadence/compare/v1.2.7...v1.2.8


Release v1.2.7 published by neil-xie on 2/9/2024

What's Changed

Added

Fixed

Changed

Removed

New Contributors

Full Changelog: https://github.com/uber/cadence/compare/v1.2.6...v1.2.7


Release v1.2.6 published by neil-xie on 12/14/2023

What's Changed

Added

  • Added range query support for Pinot json index by @bowenxia (#5426)
  • Implemented GetTaskListSize method at persistence layer by @Shaddoll (#5442, #5447)
  • Added a framework for the Task validator service by @agautam478 (#5446)
  • Added nit comments describing the Update workflow cycle by @agautam478 (#5432)
  • Added log user query param by @bowenxia (#5437)
  • Added CODEOWNERS file by @taylanisikdemir (#5453)
  • Added a function to evict all elements older than the cache TTL by @jakobht (#5464)

Fixed

  • Fixed workflow replication for reset workflow by @Shaddoll (#5412)
  • Fixed visibility mode for admin when using Pinot visibility by @neil-xie (#5441)
  • Fixed workflow started metric by @ketsiambaku (#5443)
  • Fixed timer-fixer, unfortunately broken in 1.2.5 by @Groxx (#5433)
  • Fixed confusing comment in matching handler by @jakobht (#5450)

Changed

  • Cassandra version is changed from 3.11 to 4.1.3 by @taylanisikdemir (#5461)
    • If your machine already has the ubercadence/server:master-auto-setup image, you need to re-pull it so that it works with the latest docker-compose*.yml files
  • Move dynamic ratelimiter to its own file by @jakobht (#5451)
  • Create and use a limiter struct instead of just passing a function by @jakobht (#5454)
  • Dynamic ratelimiter factories by @jakobht (#5455)
  • Update github action for image publishing to released by @3vilhamster (#5460)
  • Update matching to emit metric for tasklist backlog size by @Shaddoll (#5448)
  • Change variable name from SecondsSinceEpoch into EventTimeMs by @bowenxia (#5463)

Removed

  • Get rid of noisy task adding failure log in matching service by @taylanisikdemir (#5445)

New Contributors

Full Changelog: https://github.com/uber/cadence/compare/v1.2.5...v1.2.6


Release v1.2.5 published by sankari165 on 11/2/2023

What's Changed

Added

  • Scanner / Fixer changes by @Groxx in https://github.com/uber/cadence/pull/5361
    • Stale-workflow detection and cleanup added to shardscanner, disabled by default.
    • New dynamic config to better control scanner and fixer, particularly for concrete executions.
    • Documentation about how scanner/fixer work and how to control them, see the scanner readme.md
    • This also includes example config to enable the new fixer.
  • MigrationChecker interface to expose migration CLI by @abhishekj720 in https://github.com/uber/cadence/pull/5424
  • Added Pinot as new visibility store option by @neil-xie in https://github.com/uber/cadence/pull/5201
    • Added pinot visibility triple manager to provide options to write to both ES and Pinot.
    • Added pinotVisibilityStore and pinotClient to support CRUD operations for Pinot.
    • Added pinot integration test to set up Pinot test cluster and test Pinot functionality.

Fixed

Full Changelog: https://github.com/uber/cadence/compare/v1.2.4...v1.2.5-prerelease3


Release v1.2.4 published by neil-xie on 9/27/2023

What's Changed

Full Changelog: https://github.com/uber/cadence/compare/v1.2.3...v1.2.4


Release v1.2.2 published by sankari165 on 9/19/2023

What's Changed

Full Changelog: https://github.com/uber/cadence/compare/v1.2.1...v1.2.2


Release v1.2.1 published by davidporter-id-au on 9/19/2023

Project release: Zonal isolation

This version introduces a few resiliency concepts into customers' worker task processing such that they can detect deployment or configuration failures earlier. These features are opt-in.

The high-level concept is to provide a means to subdivide work for workers (into so-called 'isolation-groups') along whatever partitioning mechanism your service requires.

By default, the partitioning mechanism provided will attempt to keep workflows running in the location where they are started, so that customers can identify broken changes earlier rather than waiting for the deployment of an entire region. However, if there are no pollers available in that subdivision, it'll route the work elsewhere.

Nomenclature

  • Partitioning: A means to subdivide the tasks given to workflows, of which there are many possible schemes and one default one provided. When a workflow is started, a group of partition keys are provided by request headers. The partition keys are used to determine which isolation group of workers should process these workflows.
  • Workflow pinning: A partitioning scheme which emphasizes keeping workflows running in the location they were started.
  • Isolation-groups: A division of work within a customer region in which they can subdivide their workers and pin the workflows. This originally was intended as a synonym for 'zone' in the site reliability, as a subdivision of a region. However the important point is that this is a failure domain for customer workflows, so this may be an arbitrary subdivision of your cluster's traffic.
  • Isolation-group drain: A means of excluding work from an isolation-group. If an isolation group is drained, workers from that isolation group won't be able to get any task. And customers cannot start workflows from that isolation group.

Default concepts and approaches

The partitioning and isolation concepts are intended to be flexible, general-purpose orchestration concepts, with some basic defaults provided. By default, the following behaviour is given:

  • Partition data is persisted with workflow execution records by the provided middleware if the provided header is passed when workflows are created.
  • The cadence client and worker Go libraries will pass these as headers if provided in client options

Pinning behaviour

The workflow's original zone is captured when the workflow starts and is used during workflow processing.

The default partitioner provides the following behaviour: it will attempt to dispatch work in the zone where the workflow was started. However, workers may not be available in that zone, or may no longer be available for some reason. So the partitioner takes information from a lookback of poller information and uses this lookback data to ensure that the workflow can be processed. If the start isolation-group is not available, it'll pick another healthy one at random.

'Health', here, is determined as the presence of pollers and the absence of drains.

The 'unpinning' is important for two main reasons. Firstly, it's quite possible to start a workflow from an isolation-group unrelated to the one in which the pollers are created, and suddenly blackholing that work would likely not be the desired behaviour. Secondly, and probably more importantly, this prevents a head-of-line blocking problem internally for Cadence. At the database level (in this release anyway) tasks need to be dispatched in order, so if an isolation-group were not processed, it would block task processing.

Drains

This release also introduces a simplistic notion of drains, which allow for isolation-groups to be excluded from traffic processing, should that be required. Drains are issuable via the Admin API or via cli:

eg:

cadence admin isolation-groups update-global --set-drains zone-1
cadence admin isolation-groups get-global

This information is stored in the config-store and is not part of dynamic configuration.

Configuration

To use this feature, the following configuration is required:

  • system.allIsolationGroups: This is a list of all the possible isolation-groups
  • system.enableTasklistIsolation: This is the bool flag to enable it for a domain

Implementation

The changes for this feature are largely in Matching and can be (reductively) described as follows: sync and async match in Cadence are made aware of a new dimension, their associated isolation-group. Tasks piped through the Matching service are matched to the appropriate isolation-group channel.
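As a rough illustration of the "one channel per isolation-group" idea, the following Go sketch routes each task to a channel keyed by its isolation-group, and lets a poller read only from its own group's channel. It is a simplified model and not the Matching service's real code: the `task` and `groupChannels` types and the `offer` and `poll` methods are hypothetical names invented for this example.

```go
package main

import "fmt"

// task is a hypothetical task tagged with the isolation-group it belongs to.
type task struct {
	ID             string
	IsolationGroup string
}

// groupChannels holds one buffered task channel per isolation-group.
type groupChannels map[string]chan task

// newGroupChannels creates a channel with the given buffer for every group.
func newGroupChannels(groups []string, buffer int) groupChannels {
	gc := make(groupChannels, len(groups))
	for _, g := range groups {
		gc[g] = make(chan task, buffer)
	}
	return gc
}

// offer places a task on its group's channel without blocking.
// It returns false if the group is unknown or its channel is full.
func (gc groupChannels) offer(t task) bool {
	ch, found := gc[t.IsolationGroup]
	if !found {
		return false
	}
	select {
	case ch <- t:
		return true
	default:
		return false
	}
}

// poll returns the next task for the poller's group, if one is waiting.
func (gc groupChannels) poll(group string) (task, bool) {
	ch, found := gc[group]
	if !found {
		return task{}, false
	}
	select {
	case t := <-ch:
		return t, true
	default:
		return task{}, false
	}
}

func main() {
	gc := newGroupChannels([]string{"zone-1", "zone-2"}, 1)
	gc.offer(task{ID: "t1", IsolationGroup: "zone-1"})

	_, ok := gc.poll("zone-2")
	fmt.Println("zone-2 got a task:", ok) // false: the task belongs to zone-1

	t, ok := gc.poll("zone-1")
	fmt.Println("zone-1 got a task:", t.ID, ok) // t1 true
}
```

A poller in zone-2 never sees the zone-1 task in this model, which is the isolation property; the fallback to another healthy group described under "Pinning behaviour" would be decided before the task is offered to a channel.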

What's Changed

New Contributors

Full Changelog: https://github.com/uber/cadence/compare/v1.0.0...v1.2.1


Release v1.2.3 (Retracted, please use v1.2.4) published by Shaddoll on 9/15/2023

Added

Expose workflow history size and count to client by @timl3136 (#5392)

Fixed

[cadence-cli] fix typo in input flag for parallelism by @sankari165 (#5397)

Changed

  • Update config store client to support SQL database by @Shaddoll (#5395)
  • Scaffold config store for sql plugins by @Shaddoll (#5396)
  • Improve poller detection for isolation by @Shaddoll (#5399)


Release v1.0.0 published by Groxx on 4/26/2023

We are v1.0! (with a schema upgrade)

What does this mean?!

Not much. Primarily that we are declaring "it's stable and in use" more visibly, because we continually get questions about this :) A larger public announcement / state-of-the-project is in the works.

Importantly, v1.0 does not imply any change to backwards compatibility (the minimum supported client version has not changed), RPC compatibility (ditto, all changes are backwards compatible), or Go API compatibility (this is not truly a library, Go compatibility is not a goal).

Going by previous version patterns, this would have been labeled v0.26.0 as it is a relatively incremental change (plus schema changes) from v0.25.0. As such, some strings still reference "0.26", because this older SHA is the one we have been using the most internally.
These strings will be updated and validated soon, and will likely be released as v1.0.1. This should have no behavioral impact at all, but will be visible in metrics, logs, and display strings.

What do I need to do to upgrade?

Schema upgrades needed

There have been schema changes to both normal and visibility datastores, primarily to provide better data for cleanup and hot-shard detection:

These were intentionally kept out of v0.25.0 to keep that upgrade simple, as they were not fully utilized yet.

Replication cache recommendation

We have internally disabled the replication cache (history.replicatorCacheCapacity dynamic config set to 0), due to unexpectedly large memory use under abnormal load, and you may wish to do so as well.

We did not encounter any misbehavior, and it did reduce database load as intended, but we intend to make some changes to it to estimate and constrain memory use before re-enabling.

What has changed?

At a very high level, we've been focused on:

  • Internal scaling challenges, both improving bottlenecks and improving our ability to accurately identify bottlenecks
    • Many metrics, logs, and refactors are at least somewhat related to this
    • Our multi-cluster support is improved in particular, as we have been connecting clusters and moving many domains to spread load more evenly
  • Database corruptions, as our Cassandra clusters have had some problems that cause issues for months
    • Many logs, scanner, and stale-task changes are related to this, e.g. to detect and remove invalid data
  • Scaling up the team
    • More changes to come!

Some loosely categorized PRs that were included follow:

Critical bugfixes (resolving issues in v0.25.0)

Parent-close-policies apply to child workflows even after they reset/continue-as-new/etc

  • Update parent close policy to terminate/cancel child workflows even after continue as new by @Shaddoll in https://github.com/uber/cadence/pull/5032
    • This requires new stored data, so it does not apply to child workflows started before this version.

Better config introspection

Schemas are now available via the go module, as go:embed files

Enhancing existing metrics and logging (and more included in other PRs)

Misc

New Contributors

Full Changelog: https://github.com/uber/cadence/compare/v0.25.0...v1.0.0


Release v0.25.0 published by Shaddoll on 1/31/2023

Important Notice: If you're experiencing OOM after deploying this version, please update this dynamic property to disable replication cache.

history.replicatorCacheCapacity:
- value: 0

Per-domain metrics

  • 483a1492d Introduce per domain metrics (#5012)
  • e87bd74da Added logs for domainName empty situation (#4987)
  • c8783f0b3 Addition of domainName tag to Replication task (#4975)
  • 88991f2ff Addition of domain tag for Replication task metric (#4974)
  • e69dbd6a6 Added changes to readHistoryBranchRequest (#4972)
  • 76a025a7a Added domainName change to remaining functions of appendHistoryNodeRequest and RecordWorkflowExecutionUninitializedRequest (#4968)
  • 0f590423c Added changes to archival client (#4958)
  • d1965b1ab Added domain Tag to UpdateTaskList,DeleteTaskList,LeaseTaskList,CompleteTask and CompleteTaskLessThan (#4950)
  • 4c8013d76 Added changes to GetTask and CreateTask (#4947)
  • e88a9c7ad Added changes to PutReplicationTaskToDLQ and IsWorkflowExecutionExists (#4946)
  • b9b8b42b9 Added changes to DeleteCurrentWorkflowExecution and GetCurrentExecution (#4944)
  • 8c5f2ffb4 Added changes to ConflictWorkflowExecution and DeleteWorkflowExecution (#4943)
  • 13a130be7 Added changes to GetWorkflowExecution and UpdateWorkflowExecution (#4938)
  • 2bb13a17d Added DomainTag changes to ReadHistory branch for readHistoryRequest, CreateWorkflowRequest + added DomainCacheNoOp file (#4930)
  • c091a4960 Changed DeleteHistoryBranch and GetHistoryTree by adding Domain Tag with mocks (#4928)
  • b34f4e4b9 Adding DomainTag to the ForkHistoryBranch, ReadRawHistoryBranch and ReadHistoryBranchByBatch (#4926)
  • 6cf4252d4 Adding DomainTag to the Persistence metrics client (#4922)
  • c3f7bd347 Addition of DomainTag to required functions for the creation of metrics required for Domain Cost Attribution (#4908)

Replication improvement

  • 62428546f Immediate replication task hydration after successful transaction (#4980)
  • beaf67011 Return early when there are not replication tasks (#4982)
  • d38b08e45 Add Metric Emitter, which emits a metric once a minute for true replication lag in nanoseconds. (#4979)
  • 1a2804dc7 Reduce metrics cardinality for replication.TaskStore (#4981)
  • 93a6f2348 Return persisted history events blob (#4953)
  • 1be9b6d6a Replication cache for sharing hydrated messages (#4952)
  • 457c35e4f Partial response of GetReplicationMessages on history service (#4935)
  • d739bf5f6 Helpers for getting enabled and remote cluster info (#4951)
  • 385c1c368 Adds more pertinent information about replication (#4931)
  • fe3bf0b6b Refactor task ack manager (#4894)
  • 83aa1938a Removed TaskID from types.HistoryTaskV2Attributes (#4876)

Observability improvement

  • 1e788db68 Add domain_type and cluster_groups tags (#4990)
  • ff113929f Improve logs for task executor (#4989)
  • e597b8724 Add logs to debug transfer task (#4970)
  • 177f08713 Improve log for transfer task validator (#4961)
  • b0d1f06e8 Capture CassandraLWT error and log/bump metrics for it. (#4888)
  • 50d331a4b add activity info logging (#4867)
  • 93bda8f59 cadence-history does not emit continue-as-new metrics (#4866)
  • 7854f812d Add empty response metrics for read operations (#4855)
  • 471e6d164 Log replication messages that did not fit (#4844)
  • b03d03e77 add metric tags for activity task disaptch (#4821)
  • d21162d22 Add logs for domain failover (#4810)
  • 400bbe46b Improve failover coordinator error logging (#4811)
  • a51b61349 Log error fields as tags (#4801)
  • c59865478 Improve task re-dispatch error logging (#4809)
  • 22f97c80b Log error when fetchHistoryFromRemote fails (#4807)
  • 33edece45 Add source_cluster tag when emitting DLQ size (#4782)

Activity dispatch optimization

  • 52203abc1 count local and server optimized activity dispatches as started (#4901)
  • bafdf15b1 do not wait for activity task channel if sync match from history (#4860)
  • 361edb68d add activity dispatch configs to matching (#4818)
  • e77b43dd0 add activity dispatch configs (#4816)
  • 2b0b03f69 updated idl for activity task dispatch (#4815)
  • 2890600be add data contract for activity task dispatch (#4813)
  • cda6c5324 set EnableActivityLocalDispatchByDomain default value to true (#4788)

Restart workflow

  • e5036ed7c CDNC-1781 Add restart command/api (#4900)

Cross Cluster operations

  • e5ed7f726 Feature/adding canary for cross cluster -> readme patch (#4870)
  • 68fb2e60d Adds cross-cluster canary (#4868)

Corrupted workflows

  • 79437b3d3 Introduce a dynamic config for cassandra all consistency level delete (#5000)
  • 052d77c59 Update Cassandra deletes to use ALL consistency level (#4984)

Cancel workflow

  • add4b390a Standardizing cancellation behavior: a canceled workflow never starts a new run (#4898)
  • f1c557870 adding reason to cancel workflow. (#4934)

Failover lockdown

  • 147172c1d Feature/cdnc 2263 Add toggle which can block domain failovers (#4786)

Bug fixes

  • c2ffb71dd Adds fix for domain ack level issue (#5001)
  • 3985fec96 Fix history corruption check for workflow signaling (#4998)
  • 1375e49ca Revert "Fix error conversion for WorkflowExecutionAlreadyStartedError (#4838)" (#4999)
  • 494f202d6 Fix status check for visibility and archival (#4864)
  • a7270495a Bugfix/correct failover issue target domain not active ii (#4840)

Misc improvements & updates

  • 78a755c7a Add new unit test (#5008)
  • 278a3b8a0 Re-enable workflow test (#5007)
  • 43c9ebc5f Fix Cadence CLI (#5005)
  • 146bc31b3 Update idls (#4997)
  • 6da9676b5 Convert client peer resolving errors to service transient errors (#4993)
  • a91a250ef Adding first scheduled time metadata field for cron workflows. (#4969)
  • 5eb67d147 Make test now passes locally (#4915)
  • 3aaa1e8e8 Allow docker compose to work with docker-compose-mysql.yml on M1 (#4983)
  • 854fc59f4 Run docker build on commits, to prevent docker build from breaking in the future (#4978)
  • 172abd6f4 Fix docker build. (#4977)
  • 701fb7061 Adding limit for amount of pending activties in mutable state. (#4959)
  • 6ecd1e4e7 Fixing test. (#4941)
  • d8cb61eb8 Upgrade Golang base images to remediate CVEs (#4957)
  • f2b210821 Simplify shard write operations (#4955)
  • 9949a22c1 Simplify history engine task read ID logic (#4949)
  • 756601890 fix funcorder linter (#4942)
  • b21f34f8a add funcorder linter (#4939)
  • e3496a308 Add List*Execution (ElasticSearch) API ratelimiters (#4925)
  • 85e0fee1f Fix flacky QueryWorkflow tests (#4932)
  • 341d9f081 Improve decode_thrift output (#4929)
  • a4d77f547 Fix query workflow high latency after a long inactive time (#4871)
  • 43a17d2f6 downgrade testify to fix monorepo (#4918)
  • ef8d11e33 Update revive to catch more defer/recover badness (#4917)
  • 82544de0c Replace unsafe usage of recover() in helper functions (#4913)
  • c06649e60 Fix remaining server lint warnings and make lint error by default. (#4911)
  • 8b42a6dcc Start fixing server lint warnings (#4909)
  • d2f72d88d Fix flaky retrypolicy tests. (#4905)
  • 25e221bcf Add new CI step for lint validation (#4903)
  • 64cb46fb9 Add new es record for uninitialized workflow execution (#4899)
  • 8c449b316 Add JitterDelay option when creating workflows. (#4886)
  • 1f8c93a91 reduce MatchingActivityTaskSyncMatchWaitTime default value (#4897)
  • 7da6bc024 [codegen] introduce gowrap for generating retryableClient (#4879)
  • ed2beb20f Separating tools dependencies from main dependencies (#4895)
  • de0992686 Minor makefile cleanup, verbose CI, fmt with a recent Go version (#4896)
  • cfd637e26 add mockery to go generate (#4887)
  • 6f9e2d9c3 upgrade go version to 1.17 in go mod and Buildkite dockerfile (#4889)
  • 663a041c9 Added support for network topology strategy (#4875)
  • ac107606e Move visibility operation from search attributes to indexer message (#4881)
  • 691bf3f82 Magically speed up integration tests by nearly 10x (#4892)
  • e9915ae66 Rename dockers default cluster name to match the other config files. (#4885)
  • aff5ecf6a Simplified FindFirstVersionHistoryByItem (#4882)
  • 4cfb74142 fix flaky TestDelayStartWorkflow (#4884)
  • 9f2190050 update generated code (#4880)
  • 600904405 Support allowed authenticators in tool (#4873)
  • f133d3c58 Add support for changing the gocql connect timeout (#4874)
  • dc5230f44 Update idl for StickyWorkerUnavailableError (#4869)
  • 9e6d122a7 Used exposed admin proto IDLs (#4865)
  • 093030526 Add visibility operation types to Kafka message (#4828)
  • ae1441294 Move some proto definitions to admin package (#4861)
  • af932bd81 Fix CLI rendering long workflow types (#4853)
  • b457b553e Make cluster.Metadata a struct and stop using mocks for it (#4851)
  • 12d8c5412 Add UpdateFromConfig function to schema tool library (#4848)
  • d6ae27853 Decouple domain cache entry from cluster metadata (#4847)
  • 15267b96f Separate buildkite pipeline for PRs (#4850)
  • 0582a58a8 Update SQL implementation of UpdateExecution to support async transaction (#4792)
  • 535cda845 Remove unused loggers from history (#4822)
  • 915a777c9 Simplify history builder (#4837)
  • beab75c6f Removing target-domain-not-active special-case handling (#4835)
  • a57590894 Extract Engine from matching handler (#4833)
  • 20329a2b7 Forward activity responses and heartbeats on failover as well (#4823)
  • fbfafb9f5 Update PROPOSALS.md (#4831)
  • 94fd0a65d Update roadmap.md (#4829)
  • 0a37a8b47 remove redundant type conversions for activity task dispatch (#4820)
  • ee5461b7c Check for resurrected activities during RecordActivityTaskStarted (#4806)
  • 4194b291d Remove unused PayloadSerializer param (#4827)
  • 45770c2e3 Add CustomDomain and Operator as default indexed keys (#4825)
  • eede46696 Fill domainID for backwards compatibility (#4819)
  • 8b100632b Fix error type returned from GetWorkflowExecution and DeleteWorkflowExecution (#4817)
  • fc9d5faec Change access dienied error type (#4808)
  • e91a5a7e4 Allow decoding thrift from base64 string via CLI (#4805)
  • 5be511b8d Update base image to Alpine 3.15 (#4804)
  • e1aaeb76e fix WriteFile fail err hidden by Close invaildargment (#4744)
  • 21537697c Minor makefile cleanup, gofmt (#4802)
  • 6ea8658c3 Only update maxReadLevel after successful re-acquire of shard (#4799)
  • b49002da7 Add jittered workflow deletion configuration (#4789)
  • 91579a15d Fix docker prometheus config for linux docker (#4793)
  • 7a1fe537d Wrap underlying cause for conditional update error (#4797)
  • 480c733d3 Double inline archival time limit defaults (#4796)
  • 8c6164192 Use errgroup.Group for fanout style workfloads (#4784)
  • 8845d979c Update EnableRecordWorkflowExecutionUninitialized flag to filter by domain name (#4904)
  • 807a0e289 Added API for retrieving DLQ message count (#4787)
  • ba4a5d951 Support refreshing long running workflows based on user config (#4770)

Cleanup & Refactoring

  • cb7987640 Drop dynamic config for gRPC message size (#5002)
  • 354e6b07c Removed replication mocks (#4883)
  • 608bcb5d6 Remove unused functions from TaskAckManager (#4872)
  • 9d4524183 Add helper function to list all dynamic config keys used in production (#4891)
  • 650cf8aff Refactor dynamic config (#4863)
  • e8a06cc3d Update the default values of dynamic config to not depend on static config (#4858)
  • 2408f9dd0 Removed unused internal type getters (#4852)
  • c6ce73249 Removed global domain enabled config (#4845)
  • 3a813e850 Remove domain cache from history/workflow (#4846)
  • 3cfcaeab5 Remove no-longer used dynamic configs (#4843)
  • 856d33fb2 Shard tag not needed in shard.Context (#4842)



Release v0.24.0 published by vytautas-karpavicius on 5/24/2022

Schema upgrades (required)

Cassandra: upgrade schema to 0.33

  • 288e935d6 Persist domainID instead of domainName for childExecutionInfo (#4601)
  • d9e5003c3 Handle applyParentClose target domain failover (#4533)

gRPC Support

  • Internal traffic is now on gRPC by default
  • Cadence canary is now on gRPC
  • Cadence worker is now on gRPC
  • Cadence CLI supports --transport to use gRPC (default is still tchannel)
  • Added support for TLS

  • 028c44464 Update cadence go client to 0.19.0 (#4696)
  • fd510e170 Export ResponseInfoMiddleware & InboundMetricsMiddleware (#4680)
  • 386536176 Switch system worker to gRPC (#4679)
  • ff71ae356 Shuffle responses for replication messages (#4652)
  • ad49ea6db Fix ResponseInfo to work on all transports (#4649)
  • 5ccad581e Use generated proto types from cadence-idl repo (#4630)
  • 86981576d Add inbound header forwarding middleware (#4637)
  • 9237acb18 Use direct outbound for matching client. (#4622)
  • 04cd354f8 Use direct outbound for history client (#4619)
  • 851081651 Add TLS support on gRPC (#4606)
  • 2af324659 Handle error case in response info middleware (#4609)
  • 53833a21a Fix and improve canary thrift config and docs (#4580)
  • a0ccc85fc Switch canary to gRPC (#4570)
  • b21e5e015 Remove dispatcher provider (#4559)
  • b2037bfc4 Removed frontend client randomisation (#4558)
  • aa9e7a5ce Fix public client default value after xdc switching to gRPC
  • 9ff3eb35f Added cross DC outbound builder (#4552)
  • 7a6b851a2 Remove unused NewFrontendClient functions (#4553)
  • f2f859bcc Move out dispatcher from client factory (#4506)
  • 37a8fd767 Add inbound metrics middleware (#4545)
  • 9db1a61cd Added combineOutbounds to combine multiple outbound builders (#4538)
  • b1e3001f7 Use common dispatcher for public client outbound #2 (#4537)
  • d19cae112 Revert "Use common dispatcher for public client outbound (#4523)" (#4534)
  • a094a330b Use common dispatcher for public client outbound (#4523)
  • 844181fb2 Add size checker when replication messages return (#4521)
  • 9ba3b991b Added response info middleware (#4522)
  • f3e389769 Move out auth middleware and add test coverage (#4519)
  • 580c448dc Introduce rpc.Params (#4517)
  • e45753acc Refactor PeerChooserFactory out of DispatcherProvider (#4508)
  • a53f4c957 Move dispatcher provider to rpc package (#4507)
  • 0b2107fdc Moved RPC related types to a dedicated package (#4505)
  • 5846821cb Use gRPC outbound by default for internal traffic (#4492)

Membership changes

Pluggable membership information provider with extended host metadata.

  • d3e03c240 Ringpop: set tchannel port even if label is missing (#4765)
  • f65fecb91 Ringpop: filter out unhealthy nodes (#4764)
  • 4dab59a9f Use named port to select transport for outbound calls (#4749)
  • 9b5071757 Provide portmap to ringpop (#4745)
  • 8477b11ef Return Hostinfo identity if set (#4739)
  • acff10cf6 Add correct Address tag (#4736)
  • 29874b61f Lock membership keys after peer provider call (#4733)
  • 7e3d48ccf Protect membership member keys concurrent access (#4731)
  • 45bc726dc Hashring: return Hostinfo struct instead of string (#4708)
  • 7a17a3032 Extend Hostinfo with identity and port map (#4706)
  • 770e9ec4d Replace Ringop with PeerProvider interface (#4653)
  • 3557eb5dd Merge membership Monitor and ServiceResolver to membership.Resolver (#4646)
  • 97f169052 Reduce API scope for membership.Monitor (#4644)
  • 7e141025b Move ringpop setup to common/membership (#4638)
  • c145ab861 Remove Membership Factory (#4627)
  • e15f181ad Support DNS SRV Records within Ringpop (#4614)
  • 9a072ca10 Provide Channel for Ringpop (#4597)

Cross Cluster operations

Cross cluster domain dependency support for signals, child workflows, cancels and parent close policy (pre feature release).

  • b7d2c7764 Generate parentClosePolicy task for x-cluster child (#4682)
  • c894177bc Improve cross cluster components shutdown logic (#4662)
  • 624a1fcee Bug fixes for cross domain operations (#4623)
  • d3d06825a Add domain to pendingChildExecutionInfo (#4611)
  • 39bebb464 Fix target domain not active error handling for transfer task (#4599)
  • e2b8e9496 Split transfer close execution task (#4583)
  • 0bfd2f701 Schedule first decision for abandoned child if parent closed (#4579)
  • a9ed73af2 Add admin respond cross cluster task completed API (#4565)
  • 5879fa346 Misc. fix for cross cluster implementation (#4554)
  • 041061cf2 Wire up cross-cluster operation implementation (#4524)
  • 70cf8bef0 Add metrics for cross cluster implementation (#4527)
  • 898aa9128 Improve close execution task for cross cluster situation (#4528)
  • f74b915cd Execution logic for RecordChildCompletion and ApplyParentClosePolicy (#4474)
  • f2ff1c3b7 Refactor cross cluster queue implementation (#4493)
  • 52c8acc4c Limit batch size for fetching cross cluster tasks (#4487)
  • 5ac1940c2 Fix parent close policy for cross-domain childs (#4486)
  • bd7072c42 Implement xcluster source task executor (#4445)
  • 6101ab249 Implement cross-cluster source task (#4398)
  • c8f3c1cdb Support ApplyParentClosePolicy Cross Cluster Tasks (#4392)
  • c8d0838d1 Set completed workflow current version to lastWriteVersion (#4431)
  • fb8e78277 Add feature flag for scheduling cross-cluster operations (#4424)

Auto-Forwarding

  • 95402363d Update auto-forwarding to work for global domains with 1 cluster (#4681)
  • 3ee1178d9 Update batcher to support replicating workflows (#4672)
  • 06891aa5c Add Redirect policy to forward all domain APIs

ParentClosePolicy for child workflows only

  • c7727c080 Parent close policy should apply to child workflow only (#4612)

ES Analyzer

  • 80700d8c1 Add long running workflow metrics (#4643)
  • 2fa278740 ElasticSearch Analyzer (#4598)

MongoDB Support

46b84be58 Implement MongoDB plugin Part1: skeleton and ConfigStore (#4590)

IPv6 Support

064378890 feat: Fixing RPC to allow bindOnIP for IPv6 (#4620)

SQL Support

  • 085a79937 Perform schema checks for multiple SQL database and add context to AdminDB DDL interface (#4561)
  • f182b8710 Unify mysql user password for testing (#4589)
  • 75b10a5f8 Fix mysql insecure hostname verify didn't work (#4569)
  • f5ce7cb60 Implement sharded SQL driver to support using multiple SQL databases (#4504)
  • 90e2290d2 Refactor to add a SQL driver layer for multiple SQL databases support as sharded SQL (#4498)

Auth

  • 334d51fa4 add workflow type to signal with start auth (#4495)
  • f98bd06ef add enable service auth logging key (#4480)
  • b22df4152 extend permission attributes for service auth (#4468)
  • 7aca82984 Load OAuth credential on startup instead of request processing (#4442)
  • 4c2bcc75f Fix OAuth sample config and add docker-compose for OAuth testing server
  • 5191468f2 Adding middleware to inject Auth token for internal requests to frontend (#4364)

Graceful failover

  • 0c3db563a Integrate failover into describe domain response (#4440)
  • 920077cc0 Adding debug metrics in domain callback (#4484)
  • 6ee5f9378 Add getFailoverInfo API (#4408)

Refresh Tasks API

  • 417f1503c fixed refresh workflow tasks (#4750)
  • 612373147 add refresh tasks API to client (#4747)
  • a5c527f71 Allow generating workflow tasks if workflow is non-current (#4688)

Corrupted workflows

  • 69805085a Add Watchdog Workflow with Corrupt Workflow Fix (#4713)
  • e13da5812 Add fixer workflow triggered by remote (#4482)
  • 1cc94d502 Add a step to scan workflow to be in DLQ (#4471)

Activity dispatch optimization

  • de0653f2e add metric tags for activity task dispatch (#4821)
  • 3581be59d remove redundant type conversions for activity task dispatch (#4820)
  • ac8cbbd46 add activity dispatch configs to matching (#4818)
  • 532da7191 merged activity dispatch config
  • f5cfeaf7b add activity dispatch configs (#4816)
  • c4713d202 updated idl for activity task dispatch (#4815)
  • b4f38d060 add data contract for activity task dispatch (#4813)

Cadence CLI Changes

  • b445012a5 Improved CLI DLQ read command (#4780)
  • 950f5ac02 Added --format flag to render table, json or custom template (#4777)
  • c833c98cf Use RenderTable for the remaining CLI commands (#4774)
  • 99fcca82a Allow loading service config for all DB operations (#4768)
  • 0557c2b50 Added presentation layer for rendering workflow list tables (#4773)
  • 9d6589984 Allow reading shard list from stdin for CLI DLQ operations (#4771)
  • 5511bd618 Drop unused flags for cli rereplicate command (#4728)
  • ceacad005 Fix NPE when observing history in CLI (#4714)
  • 953014388 Update CLI client factory to use grpc clients (#4605)
  • 38d1e2a13 Add exclude query for list and reset-batch command (#4699)
  • cf21c869c Add skipCurrentCompleted option to reset-batch command (#4698)
  • 41c89238d Update domain describe command to support JSON output (#4674)
  • c22095099 Fix admin db thrift decode tool (#4665)
  • 75a992a47 Create ElasticSearch client via factory (#4660)
  • 9d40c45d2 Add admin tool to decode any thrift binary into JSON (#4634)
  • a370de048 Cli: notify on SIGINT (#4615)
  • 8c9db1879 Expose GetTaskListByDomain in CLI (#4462)

Bug fixes

  • 0c8a0fdff Fill domainID for backwards compatibility (#4819)
  • 6981b1d85 Only update maxReadLevel after successful re-acquire of shard (#4799)
  • 7328473c4 Fix ScanWorkflowExecutions function in frontend client (#4781)
  • 13f9cf8b3 Added missing mapper fields for DecisionTaskTimedOutEventAttributes (#4762)
  • 1923121f3 Fix auto-forwarding for QueryWorkflow API (#4763)
  • f1a09839e Fix data conversion from serialization.WorkflowExecutionInfo to persistence.InternalWorkflowExecutionInfo (#4758)
  • 059669895 Use setupBackoffTimer with locking (#4748)
  • b0da1be3b Fix SQL implementation of DeleteWorkflowExecution (#4746)
  • 19a8526f3 Update cadence batch command to receive more input (#4725)
  • edf4cb423 Fix parsing domain_id in child_info_maps for backward compatibility (#4722)
  • dea642993 Fallback to zero value for initiatedID in exteralWorkflowExecutionFields struct (#4720)
  • 27a0df202 Add decision offset to LastDecisionCompleted reset type (#4700)
  • e8fdcd91a Fix cassandra plugin nil pointer dereference issue (#4697)
  • 027bbd6c6 Fix queue diff metric for disabled clusters (#4686)
  • 35ae7e7f1 Fix canary/bench dev configs (#4675)
  • 27afb62bd Fix remote sync match for standby domains and task creation time (#4654)
  • 32123b941 Fix NPE in GCP archival (#4626)
  • e13b6686a Fallback to zero value for non-present parent execution fields (#4617)
  • 0b4768365 Fix get replication task read level update issue (#4607)
  • 110a13519 Fix NPE when replicating child started event (#4591)
  • a76d1a56f Set workflow start time in application layer (#4581)
  • e3725619d URL encode postgres credentials (#4550)
  • 384a3a4b1 Update shard context to reduce DB calls for closed shards (#4547)
  • a590cacc3 Fix record child completion error handling (#4515)
  • 3eda7b983 Fix timer resurrection check (#4499)
  • 3ef32e26e Fix workflow retry policy overriding workflow timeout (#4467)
  • 3e3bb6723 Fix startTime in workflow task refresher (#4488)
  • 7ee533975 Fix nil pointer dereference issue in matching (#4481)
  • c387b4061 Fix workflow refresh for closed workflows (#4472)
  • 723149b54 Revert ratelimiting behavior for frontend worker related APIs (#4435)

Misc improvements & updates

  • ea4d16525 As-simple-as-possible addition of background-processing ratelimits (#4775)
  • 61c5d58f9 Purge un-hydrated replication tasks when doing dlq merge (#4769)
  • e4cb51fb4 Added history corruption checks for workflow signaling (#4761)
  • 6f3da1cf8 Add more docker compose files for multiclusters with ES and different DB (#4757)
  • d83fa3a8f Log more tags while putting to replication task to dlq (#4754)
  • 71c277498 Add rate limits for inline archival (#4743)
  • a19271324 Lowers priority of a significantly noisy log entry (#4730)
  • 9abf34d09 emit task list backlog metric in matching (#4734)
  • e27ed8098 Update error message for too frequent domain updates (#4711)
  • 99ace7ae5 Upgrades python version for cqlsh (#4709)
  • a0be5926b Improve standby task processing (#4695)
  • d117c3cf6 Update ElasticSearch version to fix log4j issues (#4691)
  • 9c9566163 Update Base and Builder Images to Alpine 3.15 (#4689)
  • 37a6cfd5b Emit persistence latency histograms (#4678)
  • 97300ecfa Add jitter for starting failover queue (#4683)
  • e0a1d20fc Add support for custom cassandra authenticators (#4676)
  • f3003449d Improve error message when a workflow's decision is rejected due to concurrent changes (#4673)
  • b7303535a Support partial retries for ApplyParentPolicy (#4610)
  • 3b36196e6 Add logs for ID length violation checkers (#4655)
  • 0aa749438 Use streaming implementation of thriftrw encoder/decoder (#4568)
  • 4461fe020 Fix build by switching to golang:1.17-alpine3.13 (#4641)
  • 2957a708c Apply ConnectAttributes for postgres (#4639)
  • e2adab995 Add prometheus metric for canary/bench (#4625)
  • 2e75c15f4 Update docker image to use 1.17 golang (#4632)
  • d61a3b3b2 Update golang/mock to v1.6.0 (#4636)
  • 97a413f34 Fixing master build (#4631)
  • 2b08e4002 Add dynamic config to disable generate replication tasks (#4608)
  • 39b28e667 Add ctx parameter to queue processor actions (#4600)
  • fa6d71d1d Limited retry for normal decision scheduleToStart timeout (#4567)
  • 942f0327c Change canary back to start both worker and starter by default (#4587)
  • 5affa4f77 Fix docker template and canary batcher workflow (#4585)
  • 0e4b8d39f Add cross region sample config and minor fix docker template (#4574)
  • 49aa6352e skip error on creating domains for canary (#4584)
  • 93934ab5b Show more information about panicked objects (#4575)
  • 8e8e943fb Fix build on M1 macs (#4576)
  • bd21dc422 Remove github automerge job (#4577)
  • 87b2eae36 Support filtering global domains in ElasticSearch (#4539)
  • f8f95d5eb Improve server_test and instructions on homebrew and dynamic config docs (#4557)
  • 2719202a1 Upgrade yarpc to v1.58.0 (#4564)
  • 11e9a7bcf Improve archival history mutated error logs and add option to allow archiving incomplete history
  • e1e94ee02 Add a hard limit for number of decision retry attempts (#4490)
  • 19a954a3c Fix docker config template for secondary cluster (#4551)
  • 37e9845bd Add console as logging encoding type (#4549)
  • 3cd5166d1 Log WorkflowID, RunID, domainName when a workflow times out or gets terminated (#4548)
  • c7e94f29b Upgrade golang/sys for go 1.17 (#4543)
  • e94968737 Add server startup test to prevent docker image master-auto-setup getting broken
  • 3ab48432d Add sample config to run with opensearch (#4512)
  • 7e60482b7 Add disableSniff: true flag to update-development_es_v7.yaml (#4501)
  • 2679a9ce2 Managed failover workflow improvements (#4491)
  • 607893ddc Verify delete history branch actually delete targeted nodes (#4494)
  • 3bbf94583 fix flaky TestWorkflowRetryPolicyContinueAsNewAsCron integration test (#4496)
  • ea89af4d9 Fix access control admin handler initialization (#4500)
  • c2264524e Update get tasklist response (#4483)
  • 9e992726a Emit metrics during shardscanner fix and scan activities (#4476)
  • ef7d04914 Add documentation to canary and improvements (#4447)
  • df3e5524b Fix/docker multicluster (#4473)
  • 39184e611 Add tagged logger to shardscanner (#4466)
  • a695fc053 Set limit on range queries to prevent bad queries causing degradation (#4458)
  • a5a722314 Validate shard ID range (#4461)
  • 73579667d Remove previous domain lists from domain cache (#4463)
  • ed86ca88a Matching per domain RPS limit (#4433)
  • 7ca1886f7 Misc improvements: development & sample config & README & Issue Template
  • 4969e353a Allow replicate history generated from same cluster (#4429)
  • 7ef52d582 Add feature flag on emitting signal name metric tag (#4434)
  • 482f47859 Updated yarpc to v1.56.0 (#4423)
  • e974a2ad2 Fix auto-build script typo

Cleanup & Refactoring

  • 5ec6c4efc Drop getters on HistoryEvent (#4727)
  • 7084679d5 Staticcheck cleanup (#4751)
  • eff46e0a4 Reuse rate limiter functionality (#4742)
  • 1fd8af37b Added AsFloat64 helper to convert rps related dynamic config properties (#4740)
  • 32cf61233 Drop redundant pointer conversions (#4726)
  • b8d47dc42 Drop pointers on internal history branch types (#4718)
  • b0dff806c Removed unused parameter (#4719)
  • a964dfbd6 Do not pass yarpc options in CLI (#4712)
  • 429db9028 make fmt (#4658)
  • 307dd93f8 Fix badly formatted files (#4647)
  • a3fdf4c5b Cleanup unused setters in client.Bean (#4555)
  • 13c6a2b10 Refactor and improve the retry logic to avoid throttling dependencies (#4351)
  • 5428e35f6 Refactor service naming constants (#4516)
  • 467824e72 Reshuffle service.Config and resource.Params (#4514)
  • 132768ac8 Move serviceImpl to integration tests (#4513)
  • 8ccb533b0 Minor ratelimit test cleanup
  • 82d83f924 Refactor task executor tests (#4475)
  • 93937370e Refactor config methods for internal use (#4448)
  • 88549ddba Refactor test for internal integration tests (#4437)


Release v0.23.2 Patch Release published by yycptt on 12/2/2021

Release commits

Bug fix

  • ff5ef719a Fix ResponseInfo to work on all transports (#4649)
  • 97127f0e0 Fix remote sync match for standby domains and task creation time (#4654)

Improvement

  • b4b94c6d7 Create ElasticSearch client via factory (#4660)
  • 831dc7f0d Shuffle responses for replication messages (#4652)
  • d367a88be Add Redirect policy to forward all domain APIs (#4657)
  • 00bbe50d4 Add logs for ID length violation checkers (#4655)

Misc.

8dd7a0818 Update docker files for 0.23.2 release


Release v0.23.1 Release published by yycptt on 11/19/2021

Upgrade instructions (from 0.22.x releases)

Schema upgrades (required)

  • Cassandra: upgrade schema to 0.32

Configuration changes (optional but recommended)

  • Change clusterMetadata to clusterGroupMetadata
  • Change clusterMetadata.masterClusterName to clusterGroupMetadata.primaryClusterName
  • Change clusterMetadata.clusterInformation to clusterGroupMetadata.clusterGroup
  • Change dynamicConfigClient to dynamicconfig with client:filebased and move all fields under the old dynamicConfigClient to a new field filebased under the new dynamicconfig.
  • publicClient is no longer required. If not specified, it defaults to the current cluster's RPCAddress in clusterGroup.
  • Sample config
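
The renames listed above can be sketched as a small transformation over an already-parsed config mapping. This is only an illustration under stated assumptions: the helper name `migrate_config` is hypothetical (it is not a Cadence tool), and the sample cluster name, address and file path are placeholder values. Only the key names come from the list above.

```python
import copy


def migrate_config(old: dict) -> dict:
    """Hypothetical helper: apply the 0.23.x key renames to a parsed config."""
    new = copy.deepcopy(old)  # leave the caller's mapping untouched

    # clusterMetadata -> clusterGroupMetadata, plus two renamed child keys.
    if "clusterMetadata" in new:
        meta = new.pop("clusterMetadata")
        if "masterClusterName" in meta:
            meta["primaryClusterName"] = meta.pop("masterClusterName")
        if "clusterInformation" in meta:
            meta["clusterGroup"] = meta.pop("clusterInformation")
        new["clusterGroupMetadata"] = meta

    # dynamicConfigClient -> dynamicconfig with client: filebased;
    # the old fields move under the new "filebased" field.
    if "dynamicConfigClient" in new:
        new["dynamicconfig"] = {
            "client": "filebased",
            "filebased": new.pop("dynamicConfigClient"),
        }
    return new


# Placeholder values, chosen only for illustration.
old_config = {
    "clusterMetadata": {
        "masterClusterName": "cluster0",
        "clusterInformation": {"cluster0": {"rpcAddress": "localhost:7933"}},
    },
    "dynamicConfigClient": {"filepath": "config/dynamicconfig/development.yaml"},
}
new_config = migrate_config(old_config)
```

The sketch copies the input before renaming so the old mapping can still be compared against the result. Consult the sample config linked above for the authoritative layout.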

Release Commits

New features

Config Store

0fd2b50e Added config store functionality (initial implementation) (#4357)

Cross Domain Operations

  • 38881a8f Add X-Cluster Child Workflow Completion Tasks (#4336)
  • 40c5f189 Implement history handler for fetching and responding cross-cluster tasks (#4329)
  • adbffa45 Notify queue processor about cross cluster tasks (#4328)
  • 23eb8bea Improve is cross cluster task check (#4326)
  • af307533 Update admin CLI commands for cross-cluster queue (#4321)
  • 58e8e1cc Add cross cluster target task executor (#4317)
  • 8d319e20 Refactor task executor interface (#4300)
  • de9a0869 Add cross cluster queue processor (#4269)
  • 4384e4cc Target cluster cross cluster task processor (#4292)
  • 6964885b Improve definition for cross-cluster related endpoints (#4294)
  • eead0e57 Add client and handler implementation for cross-cluster task APIs (#4286)
  • 622b13b2 Add internal types and mappers for cross cluster related APIs (#4285)

GRPC

  • 4b95ec80 Fallback to zero value for non-present parent execution fields (#4617)
  • dbe538e5 Switch canary to gRPC (#4570)
  • 59c8f0ec gRPC for cross DC traffic (#4390)
  • 5328cba2 Expose frontend gRPC port on docker containers (#4312)

Auth

  • 8b8d8d80 add workflow type to signal with start auth
  • eacf42f1 add enable service auth logging key
  • d1a3c112 extend permission attributes for service auth
  • 35f588f4 Add authorizer protection for AdminAPI
  • 9a46d9d4 Feature cont.: authorize CLI as admin with private (#4338)
  • 37706b2f Update OAuth implementation to use domainCache to authorize (#4333)
  • 0085b7ab CLI sending authorized request (#4327)
  • 989e35c8 Add Permissions to Attributes and reading Public/Private key from disk instead of reading it from yaml file (#4320)
  • 9f5d4120 Implement OAuth Authorizer (#4306)
  • deed4824 add enable service auth key (#4299)
  • 51be8208 fill tasklist in auth attributes for poll APIs (#4296)
  • 70f3f582 add tasklist to auth attributes (#4288)

Bug Fixes

  • 4808e654 Fix NPE in GCP archival (#4626)
  • 49df6717 Handle error case in response info middleware (#4609)
  • efb7b083 Fix get replication task read level update issue (#4607)
  • 144d6946 Fix NPE when replicating child started event (#4591)
  • 0a1337c0 Fix cherry-pick for docker config template from #4585
  • 1affb65a Fix mysql insecure hostname verify didn't work (#4569)
  • 3fd8001a Fix docker template and canary batcher workflow (#4585)
  • 364b2a17 Fix and improve canary thrift config and docs (#4580)
  • f744a6f2 Fix record child completion error handling (#4515)
  • 3cb214af Fix access control admin handler initialization (#4500)
  • 0398bf6f Fix timer resurrection check (#4499)
  • 39f45eb8 Fix startTime in workflow task refresher (#4488)
  • dfca8e13 Fix nil pointer dereference issue in matching (#4481)
  • 5dd7eb7c Fix workflow refresh for closed workflows (#4472)
  • 76573a2c Fix domain updating via grpc (#4418)
  • f52498a8 Fix admin workflow re-replicate command (#4325)
  • abe22845 Fix ndc reset workflow replication bug (#4376)
  • a58b8b96 Fix oauth yaml config (#4360)
  • 8ea6a6a8 Fix CLI jwtKey npe issue (#4358)
  • cd9a33ab Fix deadlock in transfer queue (#4337)
  • fbc79f96 Fix CLI admin domain bug that didn't load Cassandra plugin
  • bec009ab Fix missing activity failure details in standby cluster (#4323)
  • a1b9679a Fix typo in docker config template which cause docker image corruption (#4310)
  • 6a00f35a Fix a racy read in test (#4291)

Improvements

  • d53b1fb1 Support DNS SRV Records within Ringpop (#4614)
  • 2c3a8f39 Change frontend drain time
  • 76653c16 Limited retry for normal decision scheduleToStart timeout (#4567)
  • dbae1304 Improve archival history mutated error logs and add option to allow archiving incomplete history
  • 4048370c URL encode postgres credentials (#4550)
  • 41e9b530 Add console as logging encoding type (#4549)
  • 0332c595 Log WorkflowID, RunID, domainName when a workflow times out or gets terminated (#4548)
  • df0c4bf4 Change canary back to start both worker and starter by default (#4587)
  • 39b1970b Update cadence go SDK for building canary in 0.23.x (#4586)
  • dcfe3f6d skip error on creating domains for canary (#4584)
  • f53bec11 Add documentation to canary and improvements (#4447)
  • e052190a Add size checker when replication messages return (#4521)
  • 16aed764 Added response info middleware (#4522)
  • a6a17935 Adding debug metrics in domain callback (#4484)
  • e602b8c2 Set limit on range queries to prevent bad queries causing degradation (#4458)
  • 6b9184c4 Refactor config methods for internal use (#4448)
  • d58d3460 Add feature flag on emitting signal name metric tag (#4434)
  • db77377c Refactor test for internal integration tests (#4437)
  • d67fb41b Revert ratelimiting behavior for frontend worker related APIs (#4435)
  • 0b980557 Rewrite/improve basic load test (#4399)
  • cde0f413 Dynamic replication batch size (#4301)
  • efb9f901 Long poll completion buffer to prevent timeouts (#4425)
  • 45c7b4c1 Improve/simplify archival config validation (#4366)
  • fb10abe0 Automatically adjust task priority and redispatch interval based on attempts (#4378)
  • 6dec5aaa Disable basic(db) visibility sampling by default (#4407)
  • 76ec20af Emit logs with workflow execution tag for timedout frontend requests (#4379)
  • 170deed4 Try detecting timer and activity resurrection (#4375)
  • 7110f057 add decision result count check (#4402)
  • 8ad444b9 Add context metric tags for admin handler (#4404)
  • c6ef3c9d Refactor ClusterMetadata defaults and validation (#4385)
  • 185988a3 Add docker file/config/compose for bench and canary
  • 3255b7c4 Failover metrics scope improve (#4391)
  • 4781a8dc Expose invalid timer value in the error message (#4380)
  • f621c7c5 Fill currentCluster RpcAddress with publicClient as default
  • 58ae9055 Optimization for start child workflow task (#4315)
  • 71e730f5 Allow removing replica from domain replication group (#4346)
  • b32af800 Enable batch job feature by default and update dynamic config docs (#4343)
  • 70bc1508 Parallelize GetWorkflowExecution SQL calls (#4339)
  • 28e0489e Server and CLI use version from release and versionChecking constant and commit revision (#4308)
  • ffbfdb7d Remove unused replicationConsumer related config (#4324)
  • 66f2f261 Refactor Cassandra test utility for NoSQL support
  • 7db76544 Fix MaximumSignalsPerExecution documentation default value
  • c63aa780 Add persistence error logs to queue manager (#4318)
  • deb0caf0 Update Mutable State to reduce unnecessary update to DB (#4304)
  • 572582c5 Refactoring Cassandra workflow persistence manager for NoSQL support-Part 3
  • c185ad8c Deep merge config files (#4165)
  • d91e86fe Enforce context timeout for retry policies in execution context and cache (#4303)
  • a24af638 Refactoring Cassandra workflow persistence manager for NoSQL support-Part 2
  • 17663af5 Add domain tag for skip task metrics and logs (#4293)
  • 28bb116f Remove tasklist kind from tasklist id (#4295)
  • 94b24053 Implement new matching and frontend API to get all tasklists for a domain (#4175)
  • ff0046f0 Rename cassandra files to nosqlStores
  • 3cc8c310 Allow skipping optional tests for optional methods in plugins (#4287)

Misc.

  • 26189983 Update docker files for 0.23.1 release
  • 01f09391 Update CHANGELOG.md (#4405)
  • 44392dc5 Update community links for Discussion
  • 74207864 Improve contributing and dev process (#4347)
  • 6f989a3d Cleanup lint warning (#4309)
  • 24cd8fa6 Clean up linting warnings (#4290)
  • 7e88e6ec Ignore bench and canary test coverage (#4297)


Release v0.22.4 Patch Release published by longquanzheng on 11/11/2021

Schema/configuration change

None.

Release commits

Bug fix

  • cfa1fa257 - Fix NPE in GCP archival (#4626)

Improvement

  • 3e7291fe9 - Support DNS SRV Records within Ringpop (#4614)

Credits

Thank you @lindleywhite for the contribution!


Release v0.22.3 Patch Release published by longquanzheng on 10/14/2021

Schema/configuration change

None.

Release commits

Bug fix

  • e710c0fa5 - Fix mysql insecure hostname verify didn't work (#4569)

Improvement

Note

Credits

Thank you @git-hulk for the contribution!


Release v0.22.2 Patch Release published by longquanzheng on 10/14/2021

This patch release introduces a dynamic config option, worker.AllowArchivingIncompleteHistory. When it is set to true, archival archives the history even if it hits a "history mutated" error.

Schema/configuration change

None.

Release commits

Bug fix

None

Improvement

  • ff344e641 - Improve archival history mutated error logs and add option to allow archiving incomplete history (#4541)


Release v0.22.1 Patch Release published by longquanzheng on 10/8/2021

Schema/configuration change

None.

Release commits

Bug fix

Improvement

  • 84b8d79ce - Add console as logging encoding type (#4549)
  • 9336ed963 - Log WorkflowID, RunID, domainName when a workflow times out or gets terminated (#4548)

Credits

Thank you @agrimrules and @WToma for the contributions!


Release v0.22.0 Minor Release published by Shaddoll on 8/20/2021

Upgrade instructions

Schema upgrades required if applicable

  • Cassandra: upgrade schema to 0.31
  • MySQL: upgrade schema to 0.5
  • Postgres: upgrade schema to 0.4
  • ElasticSearch: Add new system search attribute IsCron of boolean type. This is supported by Cadence to indicate if a workflow is a cron workflow

Configuration changes (optional but recommended)

  • Change persistence.datastores.<name>.cassandra to persistence.datastores.<name>.nosql with pluginName:cassandra
  • Set clusterMetadata.enableGlobalDomain to true to enable global domain by default. See example configuration
  • DB based visibility is no longer required if you have ES based visibility. You can safely remove it if so.
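
As a rough illustration of the first configuration change above, the datastore rename can be expressed as a transformation over a parsed persistence section. The helper name `migrate_datastores` is hypothetical, and the datastore name and connection settings are placeholder values. The sketch assumes the existing Cassandra settings carry over unchanged under the new nosql key next to pluginName; check the example configuration to confirm the exact layout.

```python
import copy


def migrate_datastores(persistence: dict) -> dict:
    """Hypothetical helper: move each datastore's cassandra block under nosql."""
    new = copy.deepcopy(persistence)  # do not mutate the caller's mapping
    for store in new.get("datastores", {}).values():
        if "cassandra" in store:
            settings = store.pop("cassandra")
            # Assumption: the old settings are kept as-is beside pluginName.
            store["nosql"] = {"pluginName": "cassandra", **settings}
    return new


# Placeholder datastore name and settings, chosen only for illustration.
old_persistence = {
    "datastores": {
        "default": {"cassandra": {"hosts": "127.0.0.1", "keyspace": "cadence"}},
    },
}
new_persistence = migrate_datastores(old_persistence)
```

Datastores that have no cassandra block (for example SQL ones) pass through untouched.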

Schema

  • 5848c942 Support Cross Cluster Operations in SQL type DBs (#4271)
  • 6f77ae2d Cassandra implementation for cross cluster queue (#4237)
  • 53edb8d8 Support is_cron field for SQL/Cassandra/Postgres (#4191)
  • fc63ab11 Support Listing Scheduled Cron Workflows (#4176)

Features

Feature Flags

0f93d577 Support Feature Flags (#4257)

Cross Cluster Task

  • 54d6ccb8 Improve is cross cluster task check (#4326)
  • ddfc1278 Convert transfer to cross cluster task if target domain is active in remote cluster (#4268)
  • ae40d3e6 Thrift and Proto definition for fetching and responding cross-cluster tasks (#4276)
  • 6e3a78de Add cross cluster task interface (#4264)
  • 42caee54 Generate cross cluster tasks in mutable state task generator (#4250)
  • 3eda53be Update shardContext for getting and updating cross-cluster queue states (#4246)
  • bda4c5c2 Add cross-cluster task related types and methods to data/persistence interface (#4225)
  • d5fbc5e3 Add cross cluster processing queue state to data/persistence interface (#4230)

CLI

  • 263227f0 Update admin rereplicate command to support SQL databases (#4227)
  • 758ace24 Update Admin CLI workflow operations to support SQL databases (#4222)
  • 3866711f Add Admin Shard List Command (#4224)
  • e3e0c263 Update Admin CLI shard operations to support SQL databases (#4207)
  • cade91a7 Update admin DB commands to support SQL databases (#4198)

Documentation

9b4aaa67 Add design doc for workflow shadowing (#4137)

Bug Fix

  • ad0261df Fix ndc reset workflow replication bug (#4376)
  • 45d5eca2 Fix deadlock in transfer queue (#4337)
  • 83feeffa Fix missing activity failure details in standby cluster (#4323)
  • 407c3664 Support prefix match for list domain command (#4275)
  • 8143d555 Fix proto FromParentExecutionInfoFields mapper (#4262)
  • 23fbb4fa Fix styles and instructions in docs (#4260)
  • 7af95b70 Fix Cassandra schema version check (#4247)
  • b6601123 Fix persistence test for task completion (#4245)
  • e2cde4af Domain queue ack level fix (#4218)
  • 2fb5c02b Pick signal requestID dedup fix from temporal (#4210)
  • 61399e53 Fix update domain replication ack level (#4212)
  • 3e2ffc33 Fix docker-compose bug: publicClient is not valid and error

Improvements

  • 362de93e Parallelize GetWorkflowExecution SQL calls (#4339)
  • 1c62c216 Server and CLI use version from release and versionChecking constant and commit revision (#4308)
  • c75cbf8c Update Mutable State to reduce unnecessary update to DB (#4304)
  • b7043ed8 Enforce context timeout for retry policies in execution context and cache (#4303)
  • dad87f7f Add domain tag for skip task metrics and logs (#4293)
  • 4305fc03 Allow skipping optional tests for optional methods in plugins (#4287)
  • b25da7ec Use global domain by default
  • 1cd5fd2c Remove unnecessary nonzero requirements in some fields of NoSQL config struct
  • c4f2c09f Adding protoversion in config file (#4263)
  • 12ddac4f Remove IsConditionalFailed from NoSQL
  • ffe4e2fa Refactoring Cassandra workflow persistence manager for NoSQL support : Part 1
  • e4a59175 Task fetcher implementation (#4272)
  • 99fb2163 Rename persistence files for consistency (#4256)
  • ff5e37ad Add common Future util implementation (#4266)
  • e528ef3a Remove timerGate usage in transfer queue processor (#4248)
  • e8cc575e Add version check for JavaConsistentQuery
  • 2b684814 Preserve some time to return part of the result if the context is timing out (#4265)
  • 7e6fbdbd Adding protoversion as a config value (#4252)
  • ea912b04 Update generating close event to use event version (#4261)
  • 0b058c0e Rename MetadataManager to DomainManager
  • 208edf41 Refactor: not require db visibility when ES visibility is provided (#4241)
  • 723ecf57 Refactoring Cassandra task persistence manager for NoSQL support
  • 532dfb10 Add placeholder for dynamo visibilityCRUD methods (#4244)
  • 8d3e54e3 Remove transfer/timer failover queue level from persistence layer (#4232)
  • 2c76237f Refactoring Cassandra visibility persistence manager for NoSQL support
  • e1fb9d77 Add logic in replication stack to handle different shard number (#4239)
  • de3a6ac4 Raise VisibilityListMaxQPS limit for local docker-compose env
  • 84059fb3 Change domain data CLI parameter for consistency (#4240)
  • 932303f9 Add conn_attrs flag to admin db commands (#4226)
  • ae963c19 Remove unused ResetWorkflowExecution method from persistence (#4236)
  • 58df9131 Regenerate proto types (#4234)
  • 466c7f57 CLI: Differentiate local and global in describe domain
  • 640d3783 Add skeleton of other nosql plugin and add dynamodb package
  • a49e2daa Refactoring Cassandra shard persistence manager for NoSQL support
  • 749c80dd Minor improvements on NoSQL interfaces and implementation (#4219)
  • 758123f8 Remove unused now parameter from task generator interface (#4217)
  • 2843e428 Shift pointer conversions down to thrift encoder (#4206)
  • 7e333a8d Wire up rebalance CLI (#4213)
  • 1f14ebf6 Add rebalance workflow (#4199)
  • 68dc1ed5 Add more comments about how to implement the interfaces in NoSQL
  • 7f49a0ff Introduce Options for task list scavenger (#4205)
  • d90f87af Refactor: move errorChecker to nosqlplugin pkg
  • d15fbe51 Tidy go mod (#4193)
  • b2adb71e Update Master cluster to Primary cluster (#4185)
  • 1d9144df Not allow more than one types of metric reporters (#4180)
  • 95713dbe Support managed failover workflow with drill mode (#4177)
  • 5a9a1efa Expose max message size config for gRPC (#4166)
  • b6b70ffb Detect when IDL submodule is not initialized, warn user (#4172)
  • 386e1e70 Expand workflowCompleted error to other APIs (#4163)
  • 800107ad Add debug info when List API is hitting ratelimiting/sampling
  • ac856a63 Add Signal Name to Signal Metrics (#4161)
  • 7457be79 Elasticsearch TLS support (#4154)
  • 36f73881 Assign CLI version on CLI commands (#4153)

Misc.

ec3596a Update docker for 0.22.0 patch release

Credits

Thank you @spmistry for the contribution!


Release v0.21.3 Release published by yycptt on 6/17/2021

Features

  • 05fb808e Support delaying cron workflows (#4196)
  • 875b690c Update admin db commands to support all registered SQL dbs (#4189)

Bug Fix & Improvements

  • 0cc2fe37 Rollback workflow already completed error (#4209)
  • 5ad53e3c Fix domain last updated time in updateDomain Operation (#4211)
  • fce6eaa8 Check identity length before RespondDecisionTaskCompleted to history (#4215)
  • 49bc2c0e Add dynamic configs and metrics to control/monitor per id lengths (#4173)
  • d55f707f Fix describeWorkflowExecution in passive cluster (#4203)
  • 94a09915 Use frontend client in parentClosePolicy workflow to allow auto-forwarding (#4202)
  • 50db73ee Add nextEventID for frontend PollForDecisionTask response (#4190)
  • d893a9c5 Fix sql error conversion (#4194)
  • 54d2653e Improve error handling for SQL persistence implementation (#4178)
  • 40fb112a Fix proto mapper for Payload (#4184)
  • cd40ea38 Handle workflow not exist error when archiving history (#4181)

Misc.

  • 36cfde8b Update docker for 0.21.3 patch release
  • 73a2ceb7 Go mod tidy


Release v0.21.0 Release published by just-at-uber on 5/7/2021

Added feature and optional config change

  • Added GRPC support. Cadence server will accept requests on both TChannel and GRPC. With dynamic config flag system.enableGRPCOutbound it will also switch to GRPC communication internally between server components.

Please configure gRPC ports for all four services, following the example.

Shadower Workflow Feature

  • 08128b113 - Workflow shadower bug fix and improvements (#4117)
  • 28fd61d8e - Shadower workflow implementation (#4042)

Bug fixes

  • b29845a06 - Fixing typos in README.md and comments File (#4142)
  • 429cb1130 - Bug fix: Declare CloseStatus and HistoryStatus as integers for development dynamicconfig (#4125)
  • 07d537122 - Fix don't swallow the error when building zap logger (#4102)
  • d2c25ac79 - Fix docker build with empty idls
  • 22f5e06f5 - Fix missing domain field when creating activity scheduled event (#4087)
  • f3eaa110d - Fix emit 0 count if encounters error (#4073)
  • d5cc770c0 - Fix domain registration error handling in cassandra (#4066)
  • b78809c69 - Fix time conversion to nanos (#4063)
  • 480cf5ee9 - Fix Dockerfile (#4083)

Tests

  • 24cefa695 - Fix race condition in task re-dispatcher test (#4147)
  • fa282dc59 - Fix flaky Cron integration test
  • f50a0fe4b - Fix flaky test (#4116)
  • af535dcc1 - Fix regression introduced by integration test refactoring (#4109)
  • d3d456138 - Fix lint error in integration tests (#4100)
  • 8fdedfb86 - Separate integration test suite definition from actual integration tests (#4098)
  • 487653fa3 - Add larger test timeout for large test (#3894)
  • 5a10dc4fe - Fix flaky integration tests (#4094)
  • dfc42d39f - Refactor integration tests (#4091)
  • 471f99be2 - Fix flaky integration test (#4092)
  • e1ad09bb5 - Fix CLI unit tests (#4090)
  • d070b9dbf - Revert "Drop pointer on ActivityTaskScheduledEventAttributes.Domain (#4009)" (#4082)

CLI

  • cf253f0f5 - Use go client version v0.17.0 (#4164)
  • 06149bd9c - Bump cli version to 0.18.4 (#4150)
  • 45ef498b5 - Update readme for CLI homebrew (#4135)
  • 7bd910513 - Support CLI Delay Start (#4097)
  • 00fafac0c - Update CLI usage for IDReusePolicy: TerminateIfRunning

GRPC

  • bc1181155 - Replace gRPC port when resolving host address (#4174)
  • 04f16e751 - Set grpc ports in docker config template (#4146)
  • d0a8f7e6a - Created GRPC handlers and internal clients (#4057)
  • d6c50f5cc - YARPC error mapping (#4070)
  • 329357ce0 - Remove omitempty for fields used in PersistenceInfo (#4151)
  • 30702c2e0 - Moved out public proto IDLs to cadence-idls repository (#4152)
  • 69d742930 - Add PersistenceInfo to DescribeCluster proto message (#4148)
  • 21b21cdbd - Add PersistenceInfo to DescribeCluster thrift response (#4149)
  • 6c06e562e - Add persistence info to DescribeCluster API call (#4136)

Others

  • 430eb7491 - Allow updating ElasticSearch only when adding new search attributes
  • 4a71b129a - Update SQL task manager to support sharded SQL DB (#4171)
  • fd3822989 - Update workflow execution metrics (#4159)
  • 62a77458d - Add Documentation for dynamicconfig: value type and allowed filters
  • 12579bd15 - Support workflow already completed error (#4123)
  • 37b8ea448 - Config layout refactor: centralize config file for documentation
  • aac9dbabf - Bump supported client feature version (#4145)
  • 7057e5e87 - Improve shadow workflow (#4144)
  • a86abb482 - Replace statsd with prometheus as default in docker-compose
  • d53e6319c - Wrap shadow workflow config inside SideEffect (#4140)
  • 8d3519b24 - Update domain deprecation command to fail if workflow exists (#4126)
  • 6fea8dd7c - Add print json flag to list domains command (#4134)
  • 6ca6e0ecc - Refactor dynamicconfig comments for documentation
  • d07d9805a - Remove active domain check in shadow workflow (#4133)
  • acd25964a - Update domain registration to check required data field (#4132)
  • 1fd6a57ae - Revert changing HistoryLength and CloseStatus to string (#4130)
  • 801fda775 - Improve tasklist and implement history scavenger for SQL (#4059)
  • b031307b3 - Refactor config: move config and dynamicconfig one level up (#4118)
  • 7fa8f081c - Allow setting levelKey for logging level (#4120)
  • bd6c19c69 - Improvements for show history from archival
  • 3a1e1bcd7 - Expose server version tag to be set during build time
  • 042923ff5 - Add proto changes for delay start field (#4104)
  • 93a3e94a2 - Forward reset request to the active cluster (#4114)
  • 0bc3681ed - Add more options to config_template for docker image (#4084)
  • 2a7b75d6b - Add version to Kafka config (#4072)
  • caf8fbd98 - WorkflowExecutionCloseStatus proto mapping (#4080)
  • 1f27aaedc - Return plain yarpcerrors.Status with no internal error match (#4081)
  • 84c938a01 - Treat persistence timeout as transient error (#4088)
  • bd5859bd0 - Improve DLQ merge and purge command (#4075)
  • e286ca01c - Add domain queries to scanner and fixer, and skip terminated scans (#4079)
  • 7661a45f7 - Update bug report and PR templates
  • 560b268f7 - Release current workflow lock on create as zombie (#4085)
  • ec3d1b5d3 - Add a domain cluster filter during replication (#4069)
  • 049196e80 - Regenerate files with thriftrw v1.25.0 (#4074)
  • dca153d73 - Use gogofast to register types using gogo registry (#4071)

Release v0.20.0 published by yux0 on 4/16/2021

New Features & Improvements

GRPC

  • 5ffd942d - Drop pointer for event id/version fields (#3948)
  • 663f74cf - Drop pointer for signal name fields (#3966)
  • 7af92bef - Drop pointer fields on remaining domain fields (#3972)
  • fba536b1 - Drop pointer on marker name fields (#3971)
  • 67ae5376 - Drop pointer for security token fields (#3970)
  • baf151a1 - Drop pointer for task id fields (#3969)
  • 523edcec - Drop pointer on visibility query fields (#3968)
  • 3c03ea2f - Revert changes to use string casting for shard ID (#3991)
  • b877308b - Drop pointer for shard id fields (#3996)
  • 0035139a - Drop pointer for DomainTaskAttributes fields (#4003)
  • c6d962ea - Drop pointer on cause/reason fields (#4002)
  • cf67a20f - Drop pointer on BadBinary fields (#4001)
  • cf5c9cc7 - Optional fields in proto IDL requests (#3989)
  • 4cd869c4 - Drop pointer on ActivityTaskScheduledEventAttributes.Domain (#4009)
  • da876dfb - Add oneof for DescribeHistoryHostRequest in proto IDLs (#3988)
  • 24509c20 - Move non-public proto errors to internal api (#4011)
  • 0921bf56 - Add missing fields/messages to proto IDLs (#4017)
  • 0e34047c - Drop pointer on remaining fields (#4008)
  • c5d67e3e - Use common ParentExecutionInfo in proto IDLs (#3997)
  • f2f10d4d - Proto mappers (#3902)
  • a8a94a21 - Revert "Drop pointer on ActivityTaskScheduledEventAttributes.Domain (#4009)" (#4082)

Bench test

  • 117d6c18 - Open source bench tests (Part 1) (#3990)
  • 1106daa6 - Open source bench test (Part 2) (#3998)
  • b5ce3140 - Open source bench test (Part 3) (#4005)
  • 174ca093 - Bench test README (#4012)

Task processing

  • 528c7142 - Deprecate old task processing logic (#3938)

Replication

  • aa93e228 - Create non-transient decision when current version updates (#3960)
  • 6c8d0486 - Release current workflow lock on create as zombie (#4085)

CLI Improvements

  • 8852cfbb - cli v0.18.3 (#3959)
  • 8aba0537 - Fix dryrun option for SQL/Cassandra schema update command

Domain deprecation

  • 586704e2 - Implement domain deprecate CLI command (#4030)
  • 0c46f3d4 - Update DeprecateDomain handler to handle global domain deprecation (#4029)
  • 1e8b738e - Update list domain CLI command to list deprecated domains (#4039)
  • b399bd50 - Disable starting new workflows in deprecated domain (#4038)

Workflow shadower

  • a2789954 - Update IDL for workflow shadower (#4035)

Bug Fixes

  • 7882518e - Provide background context to all scanners (#3965)

Misc.

  • 540ea2ca - Add missing -race arg on a couple test targets (#3957)
  • ce50c569 - bumping default QPS since frontend is making double calls to workflow list. (#3958)
  • 3ea200e8 - Attempt to make some racy tests more reliable (#3961)
  • 8cf308dc - Flatten UpdateDomainRequest (#3954)
  • c244e9d0 - Fix xdc failover integration test (#3967)
  • d39d4287 - Fix IP address in docker template (#3983)
  • 59a98cf6 - Kafka TLS: allow CA(RootCAs) or cert/key(certificate chains) only
  • c5334afa - Use a test-logger in tests rather than stdout (#3976)
  • 4b548e27 - Update cadence client (#3995)
  • 2f822ee6 - Rename metric name SignalInfo to SignalInfoSize (#3993)
  • 0a6ed6e6 - Allow configuring MaxWorkflowRetentionInDays (#3982)
  • 456286a4 - Fix some racy reads in tests (#3975)
  • aecaa886 - Pretty-simple "find dependency updates" helper (#4000)
  • c9f200a5 - pick child workflow id check (#3994)
  • e43f52ad - Move tcheck to go.mod, get rid of glide dependency (#3999)
  • 99430f5a - update goveralls (#4004)
  • addbdd47 - Default to non-verbose make test output (#4010)
  • dfde9ba5 - Decouple decision handler and history engine impl logic (#3973)
  • 94223715 - feature/changelog (#4014)
  • 80b7bfc7 - Move workflow utility components to separate package (#4015)
  • ef9a4181 - Add more default Prometheus histogram buckets for non-latency timer metrics like history_size
  • 4ed0171f - Add helper methods for sql plugin registration (#4024)
  • 6b048ce4 - Move decision related components to a separate package (#4021)
  • c0f1252a - fixed the poller count metric (#4020)
  • 0c9a74d3 - Add support for ScyllaDB (#4022)
  • 7bbf6dd7 - Log info message when outstanding poller is canceled (#4031)
  • aae8fb94 - Add compose file for multi-cluster setup (#4032)
  • 150911cf - Increase poller history LRU cache size (#4033)
  • 996cd8a7 - Cadence client downgrade to v0.15.0
  • bfcb03fe - Move visibility manager setup out of persistence base test (#4062)
  • a3708fb1 - Fix missing root dir prefix when loading dynamic config (#4056)
  • ed82bb73 - Refresh gocql session on no connection error (#4058)
  • 86e6c7c3 - Enforce context timeout in matching AddTask (#4048)
  • 2458bdbf - Minor updates: improve README and add comment for dynamic configuration (#4053)
  • f722cc0a - Emit task list lag metric in matching (#4051)
  • 6a80bf1a - Add the service name to the log entry (#4044)
  • 764500ac - Switch to gogo and yarpc compilers (#4040)
  • 1de8cd41 - Fix prometheus metric error for dual emitting (#4047)
  • e7378622 - Fix the metric ActiveClusterGauge should be as common metric (#4043)
  • 378111f4 - Context metrics tags to pass transport tag (#4037)
  • d7efde44 - Fix warnings when compiling the source code (#4036)
  • 1b3436cf - Rewrite makefile to be reliable (#4034)
  • 175850a2 - Update docker for 0.20.0 release
  • 3a7f18a1 - Fix lint error in integration tests (#4100)
  • 9b62895d - Separate integration test suite definition from actual integration tests (#4098)
  • f828ab99 - Treat persistence timeout as transient error (#4088)
  • 98ff0e59 - Refactor integration tests (#4091)
  • 472f7578 - Fix Dockerfile (#4083)
  • 03cdc869 - Regenerate files with thriftrw v1.25.0 (#4074)
  • 22ee3c0b - Fix missing domain field when creating activity scheduled event (#4087)


Release v0.19.2 published by yux0 on 4/16/2021

Bug Fixes

  • d3c287e1 - Provide background context to all scanners (#3965)



For a comprehensive list of changes and detailed release notes, please visit the Cadence GitHub Releases page.