# damocles-worker task management

In the previous document, we mentioned that in the damocles architecture, the sealing process of each sector is managed by a worker.

Therefore, the management of sector tasks, especially exception handling, is also performed by the worker instance where the sector is being processed.

However, it would be very inconvenient if every status check and every exception fix required remote access to the corresponding worker machine.

Therefore, in v0.2.0 and later versions, workers report their status to damocles-manager, and damocles-manager can manage workers remotely.

Below, we explain both how a worker can be managed locally and how damocles-manager manages workers remotely.

# damocles-worker local management

damocles-worker is managed locally mainly through a set of command-line tools that call the management interface of the running instance:

./dist/bin/damocles-worker worker

The subcommands include:

  • list
  • pause
  • resume

# list

list is used to list the current state of all sealing_threads in the currently running damocles-worker instance.

damocles-worker worker list

Let's take the mock configuration in the codebase as an example:

$ ./dist/bin/damocles-worker worker list

#0: "/home/dtynn/proj/github.com/ipfs-force-community/venus-cluster/mock-tmp/store1"; sector_id=Some(s-t010000-2), paused=true, paused_elapsed=Some(17s), state=C1Done, last_err=Some("permanent: No cached parameters found for stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f [failure finding /var/tmp/filecoin-proof-parameters/v28-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.params]")
#1: "/home/dtynn/proj/github.com/ipfs-force-community/venus-cluster/mock-tmp/store2"; sector_id=Some(s-t010000-3), paused=true, paused_elapsed=Some(17s), state=C1Done, last_err=Some("permanent: No cached parameters found for stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f [failure finding /var/tmp/filecoin-proof-parameters/v28-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.params]")
#2: "/home/dtynn/proj/github.com/ipfs-force-community/venus-cluster/mock-tmp/store3"; sector_id=Some(s-t010000-1), paused=true, paused_elapsed=Some(17s), state=C1Done, last_err=Some("permanent: No cached parameters found for stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f [failure finding /var/tmp/filecoin-proof-parameters/v28-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.params]")

As you can see, the following is listed for each sealing_thread:

  • Index number
  • Local storage location
  • Sector ID (if a sector task is being processed)
  • Whether the thread is paused
  • Time elapsed since the pause (if the sector task is paused)
  • Current state
  • Last exception information (if the sector task was paused because of an exception)
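When the list output has to be consumed by a script, each line can be split into the fields described above. The following is a minimal Python sketch under stated assumptions: the regular expression is inferred from the sample output shown here, not from a documented format, so it may need adjusting for other versions. The helper names `unwrap` and `parse_list_line` are hypothetical.

```python
import re

# Assumption: the line layout matches the sample output above.
# Optional values are printed Rust-style as "None" or "Some(...)".
LINE_RE = re.compile(
    r'^#(?P<index>\d+): "(?P<loc>[^"]*)"; '
    r'sector_id=(?P<sector_id>None|Some\s*\([^)]*\)), '
    r'paused=(?P<paused>true|false), '
    r'paused_elapsed=(?P<paused_elapsed>None|Some\s*\([^)]*\)), '
    r'state=(?P<state>\w+), '
    r'last_err=(?P<last_err>None|Some\s*\(.*\))$'
)


def unwrap(value):
    """Turn 'None' into None and 'Some(x)' into 'x' (surrounding quotes removed)."""
    if value == "None":
        return None
    inner = value[value.index("(") + 1:-1]
    return inner.strip('"')


def parse_list_line(line):
    """Parse one line of `damocles-worker worker list`; return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "index": int(m["index"]),
        "loc": m["loc"],
        "sector_id": unwrap(m["sector_id"]),
        "paused": m["paused"] == "true",
        "paused_elapsed": unwrap(m["paused_elapsed"]),
        "state": m["state"],
        "last_err": unwrap(m["last_err"]),
    }
```

A caller could feed every line of the command output through `parse_list_line` and keep the entries where `paused` is true and `last_err` is not `None` to find the threads that need attention.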

# pause

pause is used to pause the sealing_thread with the specified index number.

$ damocles-worker worker pause --index <index>

Note that:

  • index must match the index number shown by the list command.

# resume

resume is used to resume a suspended sealing_thread.

damocles-worker worker resume [--state <state>] --index <index>

Note that:

  • index must match the index number shown by the list command.
  • state is optional.

If state is not supplied, the sector will try to restart from its current state; if a valid state value is supplied, it will restart from the specified state.

The valid state values for each sealing_thread task type can be found in 11. Task Status Flow.
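The optional `--state` flag can be illustrated with a small Python sketch that assembles the argument list for the resume command. Only the flags from the usage line above are used; the function name `build_resume_args` and the default binary name are assumptions made for illustration, and the state value passed in must come from the Task Status Flow document.

```python
def build_resume_args(index, state=None, binary="damocles-worker"):
    """Build the argv for `damocles-worker worker resume`.

    `state` is optional: when omitted, the worker retries from the current state.
    """
    args = [binary, "worker", "resume"]
    if state is not None:
        # Only add --state when the caller wants to restart from a specific state.
        args += ["--state", state]
    args += ["--index", str(index)]
    return args
```

The returned list can be handed to `subprocess.run(args, check=True)` on the worker machine. Keeping the command as a list, not a shell string, avoids quoting problems.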

# damocles-manager remote management of damocles-worker

damocles-manager manages damocles-worker in two main ways:

  1. Receiving the periodic status reports sent by worker instances
  2. Calling the management interface on a specified damocles-worker instance

Remote management is done through a set of tools that either call the management interface of damocles-manager or proxy calls to the management interface of the specified damocles-worker.

./dist/bin/damocles-manager util worker

The subcommands included are:

  • list
  • info
  • pause
  • resume

# list

The list here is used to list the worker profiles that have reported information to this damocles-manager instance, for example:

$ ./dist/bin/damocles-manager util worker list
Name Dest Threads Empty Paused Errors LastPing(with ! if expired)
127.0.0.1 127.0.0.1:17890 3 0 3 3 2.756922465s

As you can see, for each instance, it will list:

  • Instance name (if no instance name is specified, the IP used to connect to damocles-manager)
  • Instance connection information
  • Number of sealing_threads
  • Number of empty sealing_threads
  • Number of paused sealing_threads
  • Number of sealing_threads that have reported errors
  • Time elapsed since the last report
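For monitoring, the table above can be reduced to a list of workers that need attention. The Python sketch below assumes whitespace-separated columns in the order shown in the sample, instance names without spaces, and that an expired report is marked by a `!` somewhere in the LastPing column. The exact position of the marker is not documented here, so that check is an assumption. `parse_worker_rows` and `needs_attention` are hypothetical names.

```python
def parse_worker_rows(output):
    """Parse the output of `damocles-manager util worker list` into dicts."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines and the header: data rows have four numeric counters.
        if len(fields) < 7 or not all(f.isdigit() for f in fields[2:6]):
            continue
        last_ping = " ".join(fields[6:])
        rows.append({
            "name": fields[0],
            "dest": fields[1],
            "threads": int(fields[2]),
            "empty": int(fields[3]),
            "paused": int(fields[4]),
            "errors": int(fields[5]),
            "last_ping": last_ping,
            # Assumption: "!" anywhere in the LastPing column means the report expired.
            "expired": "!" in last_ping,
        })
    return rows


def needs_attention(row):
    """A worker needs a look if its report expired or any thread is paused or failed."""
    return row["expired"] or row["errors"] > 0 or row["paused"] > 0
```

With the sample output above, the single worker has three paused threads and three reported errors, so `needs_attention` returns true for it.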

# info / pause / resume

This set of commands is executed against the specified damocles-worker instance.

Their effects are equivalent to damocles-worker's own list / pause / resume. They are used as follows:

  • damocles-manager util worker info <worker instance name or address>
  • damocles-manager util worker pause <worker instance name or address> <thread index>
  • damocles-manager util worker resume <worker instance name or address> <thread index> [<next state>]

Details can be viewed through help; the definition and effect of the parameters are consistent with the damocles-worker management tool.

For example:

$ ./dist/bin/damocles-manager util worker info 127.0.0.1

Index  Loc                                                                             SectorID     Paused  PausedElapsed  State   LastErr
0      /home/dtynn/proj/github.com/ipfs-force-community/venus-cluster/mock-tmp/store1  s-t010000-2  true    13m42s         C1Done  permanent: No cached parameters found for stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f [failure finding /var/tmp/filecoin-proof-parameters/v28-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.params]
1      /home/dtynn/proj/github.com/ipfs-force-community/venus-cluster/mock-tmp/store2  s-t010000-3  true    13m42s         C1Done  permanent: No cached parameters found for stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f [failure finding /var/tmp/filecoin-proof-parameters/v28-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.params]
2      /home/dtynn/proj/github.com/ipfs-force-community/venus-cluster/mock-tmp/store3  s-t010000-1  true    13m42s         C1Done  permanent: No cached parameters found for stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f [failure finding /var/tmp/filecoin-proof-parameters/v28-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.params]
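Once the underlying problem is fixed (here, the missing proof parameters), every paused thread on the worker has to be resumed. The following Python sketch derives the resume commands from the info output. It assumes whitespace-separated columns with every column populated as in the sample; rows with empty cells would need different handling. `parse_info_rows` and `resume_commands` are hypothetical helpers, and only the positional arguments from the usage lines above are used.

```python
def parse_info_rows(output):
    """Parse the output of `damocles-manager util worker info <worker>`."""
    rows = []
    for line in output.splitlines():
        # LastErr may contain spaces, so split into at most 7 fields.
        fields = line.split(None, 6)
        if len(fields) < 6 or not fields[0].isdigit():
            continue  # header, blank line, or unexpected layout
        rows.append({
            "index": int(fields[0]),
            "loc": fields[1],
            "sector_id": fields[2],
            "paused": fields[3] == "true",
            "paused_elapsed": fields[4],
            "state": fields[5],
            "last_err": fields[6] if len(fields) > 6 else "",
        })
    return rows


def resume_commands(worker, rows, binary="damocles-manager"):
    """Build one `util worker resume <worker> <thread index>` argv per paused thread."""
    return [
        [binary, "util", "worker", "resume", worker, str(row["index"])]
        for row in rows
        if row["paused"]
    ]
```

Each returned argument list can be printed for review before running it, for example with `subprocess.run(cmd, check=True)`. No next state is passed, so each thread retries from its current state.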