# damocles-worker task management
In the previous document, we mentioned that in the damocles
architecture, the process management of a sector is by worker
.
Therefore, the management of sector tasks, especially exception handling, is also performed by the worker
instance where the sector is being processed.
However, it must be very inconvenient if all status checking and exception handling require remote access to the corresponding worker
machine to operate.
Therefore, in v0.2.0 and later versions, workers report status to sector-manager, and sector-manager manages workers remotely.
Below, we will explain both how worker can be managed locally and how sector-manager manage workers remotely.
# damocles-worker local management
The local management of damocles-worker
is mainly through a set of tools provided to call the management interface to operate…
./dist/bin/damocles-worker worker
With subcommands like…
- list
- pause
- resume
# list
list
is used to list the current state of all sealing_thread
s in the currently running damocles-worker
instance.
damocles-worker worker list
Let's take the mock configuration in the codebase as an example:
$ ./dist/bin/damocles-worker worker list
#0: "/home/dtynn/proj/github.com/ipfs-force-community/venus-cluster/mock-tmp/store1"; sector_id=Some(s-t010000-2), paused=true, paused_elapsed=Some (17s), state=C1Done, last_err=Some("permanent: No cached parameters found for stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793/cointmpail/var finding [f3793/cointmpail/var] -proof-parameters/v28-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.params]")
#1: "/home/dtynn/proj/github.com/ipfs-force-community/venus-cluster/mock-tmp/store2"; sector_id=Some(s-t010000-3), paused=true, paused_elapsed=Some (17s), state=C1Done, last_err=Some("permanent: No cached parameters found for stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793/cointmpail/var finding [f3793/cointmpail/var] -proof-parameters/v28-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.params]")
#2: "/home/dtynn/proj/github.com/ipfs-force-community/venus-cluster/mock-tmp/store3"; sector_id=Some(s-t010000-1), paused=true, paused_elapsed=Some (17s), state=C1Done, last_err=Some("permanent: No cached parameters found for stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793/cointmpail/var finding [f3793/cointmpail/var] -proof-parameters/v28-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.params]")
As you can see, for each sealing_thread
, it will list
- Index number
- Local storage location information
- Sector ID (if there is a sector task being processed)
- If the task is paused
- Paused time (if there are paused sector tasks)
- current status
- Last exception information (if there is a sector task suspended due to exception)
# pause
pause
is used to pause the sealing_thread
with the specified index number.
$ damocles-worker worker pause --index <index>
Note that:
index
needs to match the index number fromlist
command.
# resume
resume
is used to resume a suspended sealing_thread
.
damocles-worker worker resume [--state <state>] --index <index>
Note that:
index
needs to match the index number fromlist
command.state
is optional.
If state is not supplied, the sector will try to restart with the current state; if the correct state
value is supplied, it will restart with the specified state
For different sealing_thread
task types, the optional status values can be found in 11. Task Status Flow
# damocles-manager remote management of damocles-worker
The management of damocles-worker by damocles-manager is mainly in two aspects:
- Receive periodic report information of the worker instance
- Call the management interface on the specified damocles-worker instance
Remote management is done through a set of tools provided to call the management interface of damocles-manager, or a proxy call to the management interface of the specified damocles-worker.
./dist/bin/damocles-manager util worker
The subcommands included are:
- list
- info
- pause
- resume
# list
The list
here is used to list the worker profiles that have reported information to this damocles-manager
instance, for example:
$ ./dist/bin/damocles-manager util worker list
Name Dest Threads Empty Paused Errors LastPing(with ! if expired)
127.0.0.1 127.0.0.1:17890 3 0 3 3 2.756922465s
As you can see, for each instance, it will list:
- instance name (if no instance name is specified, it will be the ip used to connect to
damocles-manager
) - instance connection information
sealing_thread
number- The number of empty
sealing_thread
- The number of suspended
sealing_thread
- The number of
sealing_thread
that have reported errors - The interval from the last report to the current time
# info / pause / resume
This set of commands is executed against the specified damocles-worker instance.
Their effects are equivalent to damocles-worker
’s own list
/ pause
/ resume
, which are used in the following ways.
damocles-manager util worker info <worker instance name or address>
damocles-manager util worker pause <worker instance name or address> <thread index>
damocles-manager util worker resume <worker instance name or address> <thread index> [<next state>]
Specific information can be viewed through help
, and the definition and effect of parameters are consistent with the damocles-worker
management tool.
for example:
$ ./dist/bin/damocles-manager util worker info 127.0.0.1
Index Loc SectorID Paused PausedElapsed State LastErr
0 /home/dtynn/proj/github.com/ipfs-force-community/venus-cluster/mock-tmp/store1 s-t010000-2 true 13m42s C1Done permanent: No cached parameters found for stacked-proof-of-replication- merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f [failure finding /var/tmp/filecoin-proof-parameters/v28-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.params]
1 /home/dtynn/proj/github.com/ipfs-force-community/venus-cluster/mock-tmp/store2 s-t010000-3 true 13m42s C1Done permanent: No cached parameters found for stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f [failure finding /var/tmp/filecoin-proof-parameters/v28-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.params]
2 /home/dtynn/proj/github.com/ipfs-force-community/venus-cluster/mock-tmp/store3 s-t010000-1 true 13m42s C1Done permanent: No cached parameters found for stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f [failure finding /var/tmp/filecoin-proof-parameters/v28-stacked-proof-of-replication-merkletree-poseidon_hasher-8-0-0-sha256_hasher-032d3138d22506ec0082ed72b2dcba18df18477904e35bafee82b3793b06832f.params]