# Configurations of damocles-worker
damocles-worker is the main component that executes data sealing. Let's take a look at its configuration file structure and configuration options.
The configuration file of damocles-worker is in toml format. It should be noted that in this format, lines starting with # are regarded as comments and do not take effect.
Taking a mock instance as an example, a basic configuration might look like this:
[worker]
# name = "worker-#1"
# rpc_server.host = "192.168.1.100"
# rpc_server.port = 17891
[metrics]
#enable = false
#http_listen = "0.0.0.0:9000"
[sector_manager]
rpc_client.addr = "/ip4/127.0.0.1/tcp/1789"
# rpc_client.headers = { User-Agent = "jsonrpc-core-client" }
# piece_token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJuYW1lIjoiMS0xMjUiLCJwZXJtIjoic2lnbiIsImV4dCI6IiJ9.JenwgK0JZcxFDin3cyhBUN41VXNvYw-_0UUT2ZOohM0"
[sealing]
# allowed_miners = [10123, 10124, 10125]
# allowed_sizes = ["32GiB", "64GiB"]
enable_deals = true
# disable_cc = true
# max_deals = 3
# min_deal_space = "8GiB"
max_retries = 3
# seal_interval = "30s"
# recover_interval = "60s"
# rpc_polling_interval = "180s"
# ignore_proof_check = false
[[sealing_thread]]
location = "./mock-tmp/store1"
# plan = "snapup"
# sealing.allowed_miners = [10123, 10124, 10125]
# sealing.allowed_sizes = ["32GiB", "64GiB"]
# sealing.enable_deals = true
# sealing.disable_cc = true
# sealing.max_deals = 3
# sealing.min_deal_space = "8GiB"
# sealing.max_retries = 3
# sealing.seal_interval = "30s"
# sealing.recover_interval = "60s"
# sealing.rpc_polling_interval = "180s"
# sealing.ignore_proof_check = false
[[sealing_thread]]
location = "./mock-tmp/store2"
[[sealing_thread]]
location = "./mock-tmp/store3"
[[attached]]
# name = "persist-store1"
location = "./mock-tmp/remote"
[processors.limitation.concurrent]
# add_pieces = 5
# pc1 = 3
# pc2 = 1
# c2 = 1
[processors.limitation.staggered]
# pc1 = "5min"
# pc2 = "4min"
[processors.ext_locks]
# gpu1 = 1
[processors.static_tree_d]
# 2KiB = "./tmp/2k/sc-02-data-tree-d.dat"
# fields for the add_pieces processor
# [[processors.add_pieces]]
# fields for tree_d processor
[[processors.tree_d]]
# auto_restart = true
# inherit_envs = true
# fields for pc1 processors
[[processors.pc1]]
# bin = "./dist/bin/damocles-worker-plugin-pc1"
# args = ["--args-1", "1", "--args-2", "2"]
numa_preferred = 0
cgroup.cpuset = "4-5"
envs = { RUST_LOG = "info" }
# auto_restart = true
# inherit_envs = true
[[processors.pc1]]
numa_preferred = 0
cgroup.cpuset = "6-7"
# auto_restart = true
# inherit_envs = true
[[processors.pc1]]
numa_preferred = 1
cgroup.cpuset = "12-13"
# auto_restart = true
# inherit_envs = true
# fields for pc2 processors
[[processors.pc2]]
# cgroup.cpuset = "24-27"
# auto_restart = true
# inherit_envs = true
[[processors.pc2]]
cgroup.cpuset = "28-31"
# auto_restart = true
# inherit_envs = true
# fields for c2 processor
[[processors.c2]]
cgroup.cpuset = "32-47"
# auto_restart = true
# inherit_envs = true
Below we will break down the configurable items one by one.
# [worker]
The worker
configuration item is used to configure some basic information of this worker instance.
# Basic configuration example
[worker]
# instance name, optional, string type
# By default, the IP address of the network card used to connect to `damocles-manager` is used as the instance name
# name = "worker-#1"
# rpc service listening address, optional, string type
# The default is "0.0.0.0", that is, listening to all addresses of the machine
# rpc_server.host = "192.168.1.100"
# rpc service listening port, optional, number type
# Default is 17890
# rpc_server.port = 17891
# local pieces file directory, optional, string array type
# if set, the worker will load piece files from these local directories
# otherwise it will load remote piece files from damocles-manager
# If the file "/path/to/{your_local_pieces_dir01, your_local_pieces_dir02, ...}/piece_file_name" does not exist, the worker will also fall back to loading it from damocles-manager
# local_pieces_dirs = ["/path/to/your_local_pieces_dir01", "/path/to/your_local_pieces_dir02"]
In most cases, the fields in this configuration item do not need to be configured manually.
You only need to set them in some special cases, such as:
- You would like to name each damocles-worker instance according to your own naming scheme
- You don't want to listen on all network card IPs and only want to allow local rpc requests
- Multiple damocles-workers are deployed on one machine and you want to avoid port conflicts so that they can be distinguished
- damocles-worker can access the piece store directory directly, so you can set local_pieces_dirs to load piece files locally
In scenarios like these, configure the options here manually as needed.
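For example, a sketch that covers all four cases (the instance name, host, port, and directory path below are placeholders, not recommendations):
[worker]
# a custom instance name instead of the default IP-based name
name = "worker-A01"
# listen on localhost only, so that only local rpc requests are accepted
rpc_server.host = "127.0.0.1"
# use a non-default port to avoid conflicts when multiple workers run on one machine
rpc_server.port = 17891
# load piece files locally when the piece store directory is accessible
local_pieces_dirs = ["/path/to/your_local_pieces_dir01"]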
# [metrics]
metrics
provides metrics recorder & exporter (prometheus) options.
# Basic configuration example
[metrics]
# Switch of the prometheus exporter, optional, boolean
# Default value: false
# When enabled, the field "http_listen" will be used to bootstrap a prometheus exporter
#enable = false
# prometheus exporter listen address, optional, socket address
# Default value: "0.0.0.0:9000"
#http_listen = "0.0.0.0:9000"
# [sector_manager]
sector_manager
is used to configure damocles-manager
related information so that damocles-worker
can connect to the corresponding service correctly.
# Basic configuration example
[sector_manager]
# The connection address used when constructing the rpc client, required, string type
# Accepts both `multiaddr` format and url format such as `http://127.0.0.1:1789`, `ws://127.0.0.1:1789`
# Normally, use the `multiaddr` format for consistency with other components
rpc_client.addr = "/ip4/127.0.0.1/tcp/1789"
# The http header information when constructing the rpc client, optional, dictionary type
# default is null
# rpc_client.headers = { User-Agent = "jsonrpc-core-client" }
# The verification token carried when requesting deal piece data, optional, string type
# default is null
# This is usually set when this instance allows sealing of sectors with deal data
# The value of this config item is usually the token value of the Venus service used
# piece_token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJuYW1lIjoiMS0xMjUiLCJwZXJtIjoic2lnbiIsImV4dCI6IiJ9.JenwgK0JZcxFDin3cyhBUN41VXNvYw-_0UUT2ZOohM0"
# [sealing]
sealing
is used to configure common parameter options in the sealing process.
# Basic configuration example
[sealing]
# Allowed `SP`, optional, number array format
# Defaults to null, allows tasks from any `SP`
# After configuration, only sealing tasks from `SP` listed in the array can be carried out
# allowed_miners = [10123, 10124, 10125]
# Allowed sector size, optional, string array format
# Defaults to null, allows sector tasks of any size
# After configuration, only tasks that match the sector size listed in the array can be executed
# allowed_sizes = ["32GiB", "64GiB"]
# Whether to allow adding deals to the sector, optional, boolean type
# Default is false
# When set to true, the `piece_token` item in `sector_manager` usually needs to be set at the same time
# enable_deals = true
# Whether to disable cc sector, optional, boolean type
# Default is false
# If enable_deals is true and this option is turned on, the worker will keep waiting until deal pieces are obtained instead of starting to seal cc sectors
# disable_cc = true
# The maximum number of deals allowed to be added to the sector, optional, number type
# default is null
# max_deals = 3
# The minimum size of a deal to be filled in a sector, optional, byte string format
# default is null
# min_deal_space = "8GiB"
# Number of retries when an error of `temp` type is encountered during the sealing process, optional, number format
# default is 5
# max_retries = 3
# Retry interval when a `temp` type error is encountered during the sealing process, optional, time string format
# The default is "30s", which means 30 seconds
# recover_interval = "30s"
# Interval for idle `sealing_thread` to apply for new sealing tasks, optional, time string format
# The default is "30s", which means 30 seconds
# seal_interval = "30s"
# rpc status polling request interval, optional, time string format
# The default is "180s", which means 180 seconds
# During the sealing process, some steps use polling to obtain non-real-time information, such as whether a message has landed on chain.
# This value helps to avoid overly frequent requests consuming network resources
# rpc_polling_interval = "180s"
# Whether to skip the local verification step of proof, optional, boolean format
# Default is false
# Usually only set this during testing
# ignore_proof_check = false
# Number of retries when unable to request a task from `sector_manager`, optional, number format
# default is 3
# request_task_max_retries = 3
The configuration items in sealing
usually have defaults preset from experience, which removes the need for manual configuration in most cases.
# Special configuration example
# 1. Test network, serving only specific SP
allowed_miners = [2234, 2236, 2238]
# 2. Large-scale clusters to reduce network usage
# Among the recoverable exceptions, a considerable part is caused by network jitter, increase the interval of automatic recovery and reduce the frequency of requests
recover_interval = "90s"
# Increase the interval to reduce the request frequency
rpc_polling_interval = "300s"
# 3. Increase the possibility of self-recovery from errors during sealing tasks
# Increase the number of autorecovery attempts
max_retries = 10
# Increase the auto-recovery interval
recover_interval = "60s"
# [[sealing_thread]]
Each sealing_thread
is used to configure a worker thread for a sector. Multiple sealing_thread
configuration groups can exist in one configuration file.
The sealing_thread
will inherit the configuration in [sealing]
. Any sealing.*
in the sealing_thread
configuration will override the configuration of [sealing]
.
# Basic configuration example
[[sealing_thread]]
# Sector data directory path, required, string type
# It is recommended to use an absolute path; the data directory and the worker thread are bound in a one-to-one relationship
location = "/mnt/nvme1/store"
# task type, optional, string type
# Default value is null
# All options: sealer | snapup | rebuild | unseal | wdpost, when left blank, the default is `sealer`
# plan = "snapup"
# Custom parameters of the sealing process, only applies to the current worker thread
# sealing.allowed_miners = [10123, 10124, 10125]
# sealing.allowed_sizes = ["32GiB", "64GiB"]
# sealing.enable_deals = true
# sealing.disable_cc = true
# sealing.max_retries = 3
# sealing.seal_interval = "30s"
# sealing.recover_interval = "60s"
# sealing.rpc_polling_interval = "180s"
# sealing.ignore_proof_check = false
# sealing.request_task_max_retries = 3
[[sealing_thread]]
location = "/mnt/nvme2/store"
[[sealing_thread]]
location = "/mnt/nvme3/store"
The number of sealing_threads and the corresponding sealing data directory paths need to be planned according to your specific situation.
To facilitate combination and matching, each sealing_thread can be configured with an independent sealing sub-item, which satisfies the following:
- The naming, type and effect of the configurable items are consistent with the common sealing items
- It only applies to the current worker thread
- When an item is not configured, the value in the common sealing item is used
# Special configuration example
# 1. Two worker threads, each serving a different SP
[[sealing_thread]]
location = "/mnt/nvme2/store"
sealing.allowed_miners = [1357]
[[sealing_thread]]
location = "/mnt/nvme3/store"
sealing.allowed_miners = [2468]
# 2. Two worker threads, each with different sector sizes
[[sealing_thread]]
location = "/mnt/nvme2/store"
sealing.allowed_sizes = ["32GiB"]
[[sealing_thread]]
location = "/mnt/nvme3/store"
sealing.allowed_sizes = ["64GiB"]
# sealing_thread configuration hot reload
Before damocles v0.5.0, the only way to update the configuration was to modify the sealing_thread configuration and restart damocles-worker.
This is very inconvenient in some scenarios. For example, during sector rebuild we would like to be able to reload the configuration without restarting damocles-worker after modifying the plan configuration item of a sealing_thread.
sealing_thread configuration hot reload is supported after v0.5.0. Create a hot reload configuration file named config.toml under the location directory of a sealing_thread, with definitions exactly the same as the content of a [[sealing_thread]] block. The configuration items in this file override the corresponding [[sealing_thread]] configuration items in damocles-worker, and modifying this file takes effect without restarting damocles-worker.
The sealing_thread in damocles-worker checks the config.toml file in the location directory before it starts a new sector task. A reload or removal of the configuration is triggered if the content of the config.toml file has changed or the file has been deleted.
Notice:
- The hot reload configuration file config.toml cannot override the location configuration item in sealing_thread.
- The damocles-worker main configuration file does not support hot reload.
# Basic configuration example
# /path/to/the_sealing_thread_location/config.toml
# Task type, optional, string type
# Default value is null
# All options: sealer | snapup | rebuild | unseal | wdpost, if not filled, the default is sealer
# plan = "rebuild"
# Custom parameters of the sealing process, only effective on the current worker thread
# sealing.allowed_miners = [10123, 10124, 10125]
# sealing.allowed_sizes = ["32GiB", "64GiB"]
# sealing.enable_deals = true
# sealing.max_retries = 3
# sealing.seal_interval = "30s"
# sealing.recover_interval = "60s"
# sealing.rpc_polling_interval = "180s"
# sealing.ignore_proof_check = false
# sealing.request_task_max_retries = 3
# [[attached]]
attached
is used to configure where the sealed sector persistent data is saved. Multiple configuration entries are allowed at the same time.
# Basic configuration example
[[attached]]
# name, optional, string type
# The default is the absolute path corresponding to the location
# name = "remote-store1"
# path, required, string type
# It is recommended to fill in the absolute path directly
location = "/mnt/remote/10.0.0.14/store"
# read only, optional, boolean
# Default is false
# readonly = true
We need to coordinate storage location information between damocles-worker and damocles-manager, and in many cases the same persistent storage directory is mounted at different paths on the damocles-worker machine and on the damocles-manager machine, so we decided to use name as the basis for coordination.
If the mount path of the persistent storage directory is the same on all machines, you can also choose not to configure name on either the damocles-worker side or the damocles-manager side. In this case, both will use the absolute path as the name.
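As an illustration of name-based coordination, here is a sketch assuming the same persist store is mounted at different local paths on each machine (the name and path below are placeholders, and the damocles-manager side must declare a persist store with the same name):
[[attached]]
# this name is what damocles-worker and damocles-manager use to refer to the same persist store
name = "persist-store1"
# local mount path on the damocles-worker machine; it may differ from the path used on the damocles-manager machine
location = "/mnt/remote/10.0.0.14/store"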
# [processors]
processors is used to configure sealing task executors, as well as some other information used during sealing.
This configuration item is divided into several sub-items, which we shall analyze one by one.
# [processors.limitation.concurrent]
When both [processors.limit]
and [processors.limitation.concurrent]
exist in the configuration file, the latter shall override.
processors.limitation.concurrent
is used to configure the number of parallel tasks for a specific sealing task. This is to reduce the contention of hardware resources at one given sealing stage.
It should be noted that when external processors are configured, the number of external processors and the total allowed concurrency will also affect the number of parallel tasks.
# Basic configuration example
[processors.limitation.concurrent]
# Concurrency limit for the add_pieces task, optional, number type
# add_pieces = 5
# Concurrency limit for the tree_d task, optional, number type
# tree_d = 1
# Concurrency limit for the pc1 task, optional, number type
# pc1 = 3
# Concurrency limit of the pc2 task, optional, number type
# pc2 = 2
# Concurrency limit for task c2, optional, number type
# c2 = 1
For example, if pc2 = 2
is set, then at most two sectors can perform pc2
task at the same time.
# [processors.limitation.staggered]
processors.limitation.staggered
is used to configure the time interval for staggered startup of parallel tasks during a given sealing phase. After configuring this item, when multiple tasks are scheduled at the same time for a given sealing phase, damocles-worker
will start the tasks in sequence according to the configured time interval, so as to avoid the problem of resource shortage such as slowdown of disk IO caused by the simultaneous start of tasks.
# Basic configuration example
[processors.limitation.staggered]
# The time interval for starting multiple pc1 tasks in sequence, optional, string type (e.g. "1s", "2min")
# pc1 = "5min"
# pc2 = "4min"
For example, if pc1 = "5min"
is set, when two pc1 tasks are scheduled at the same time, the second task will be executed after 5 minutes into the execution of the first task.
# [processors.ext_locks]
processors.ext_locks
is used to configure some custom lock restrictions, which is used in conjunction with the locks
configuration item in [[processors.{stage_name}]]
.
This configuration item allows users to customize some restrictions and make different external processors subject to them.
# Basic configuration example
[processors.ext_locks]
# some_name = some_number
# Special configuration example
processors.ext_locks
by itself does not apply the lock.
# one GPU, shared by pc2 and c2
[processors.ext_locks]
gpu = 1
[[processors.pc2]]
locks = ["gpu"]
[[processors.c2]]
locks = ["gpu"]
In this way, pc2 and c2 will each start one external processor, and the two will compete for the lock, which means that the two tasks will not run at the same time.
# Two GPU, shared by pc2 and c2
[processors.ext_locks]
gpu1 = 1
gpu2 = 1
[[processors.pc2]]
locks = ["gpu1"]
[[processors.pc2]]
locks = ["gpu2"]
[[processors.c2]]
locks = ["gpu1"]
[[processors.c2]]
locks = ["gpu2"]
In this way, pc2 and c2 will each start two external processors, forming pairwise competition for the two locks, which limits each GPU to executing only one of the two sealing tasks at a time.
# [processors.static_tree_d]
processors.static_tree_d is a configuration item introduced to improve sealing efficiency for cc sectors.
When a static file path is configured for the corresponding sector size, this file will be used directly as the tree_d file for cc sector
without trying to generate it again.
# Basic configuration example
[processors.static_tree_d]
2KiB = "/var/tmp/2k/sc-02-data-tree-d.dat"
32GiB = "/var/tmp/32g/sc-02-data-tree-d.dat"
64GiB = "/var/tmp/64g/sc-02-data-tree-d.dat"
# [[processors.{stage_name}]]
This is the configuration group used to configure external processors.
Currently {stage_name} could be one of the following:
- add_pieces: for the add pieces phase
- tree_d: for the Tree D generation phase
- pc1: for the PreCommit1 phase
- pc2: for the PreCommit2 phase
- synth_proof: for the synthetic proof phase
- c2: for the Commit2 phase
- unseal: for the unseal phase
- transfer: used to customize the transfer method between local data and persistent data storage
Each such configuration group translates to an external processor started for the corresponding sealing phase. If one of the above {stage_name}s is not configured, i.e. the corresponding [[processors.{stage_name}]] line does not exist in the configuration file, then damocles-worker will not start a child process for this {stage_name}. Instead, damocles-worker will use the built-in processor code to execute the corresponding {stage_name} task directly inside the sealing_thread, which means that the number of concurrent {stage_name} tasks depends on the number of sealing_threads and on the {stage_name} concurrency configured in [processors.limitation.concurrent]. Not configuring an external processor saves extra steps such as serializing task parameters and task output, but you lose capabilities such as finer concurrency control, cgroup control, and custom proof algorithms. You are free to choose according to your own needs.
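As a sketch of the built-in mode described above (paths and numbers are placeholders): with no [[processors.pc1]] group present, pc1 tasks are executed by the built-in processor inside the sealing threads, and their concurrency is bounded by the number of sealing_threads and by [processors.limitation.concurrent].
[[sealing_thread]]
location = "/mnt/nvme1/store"
[[sealing_thread]]
location = "/mnt/nvme2/store"
# no [[processors.pc1]] group is declared, so no external pc1 child process is started
[processors.limitation.concurrent]
# at most 1 pc1 task runs at a time across the two sealing threads
pc1 = 1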
The optional configuration items for [[processors.{stage_name}]]
include:
[[processors.pc1]]
# Custom external processor executable file path, optional, string type
# By default, the executable file path of the main process is used, i.e. the damocles-worker built-in processor is executed
# bin = "./dist/bin/damocles-worker-plugin-pc1"
# Customize the parameters of the external processor, optional item, string array type
# The default value is null, `damocles-worker`'s own executor default parameters will be used
# args = ["--args-1", "1", "--args-2", "2"]
# Time to wait for the external child process to become ready, optional, time string
# The default value is 5s
# stable_wait = "5s"
# numa affinity partition id, optional, number type
# Default value is null, no affinity will be set
# Need to fill in according to the host's numa partition
# numa_preferred = 0
# cpu core binding and limit options, optional, string type
# The default value is null, no binding is set
# The format of the value follows the standard cgroup.cpuset format
# cgroup.cpuset = "4-5"
# Additional environment variables for external processors, optional, dictionary type
# Default value is null
# envs = { RUST_LOG = "info" }
# The maximum number of concurrent tasks allowed by this executor
# The default value is null, unlimited, but whether the task is executed concurrently depends on the implementation of the external processor used
# Mainly used for pc1, so that multiple tasks can run in parallel within one processor, which effectively saves resources such as shared memory and thread pools
# concurrent = 4
# Custom external limit lock name, optional, string array type
# Default value is null
# locks = ["gpu1"]
# Whether to restart automatically when a child process exits, optional, bool type
# Default is true
# auto_restart = true
# Whether to inherit the environment variables from the worker daemon process, optional, bool type
# Default is true
# inherit_envs = true
# Basic configuration example
[processors.limit]
add_pieces = 8
pc1 = 4
pc2 = 2
c2 = 1
[[processors.pc1]]
numa_preferred = 0
cgroup.cpuset = "0-7"
concurrent = 2
envs = { FIL_PROOFS_USE_MULTICORE_SDR = "1" }
auto_restart = true
[[processors.pc1]]
numa_preferred = 1
cgroup.cpuset = "12-19"
concurrent = 2
envs = { FIL_PROOFS_USE_MULTICORE_SDR = "1" }
auto_restart = true
[[processors.pc2]]
cgroup.cpuset = "8-11,24-27"
envs = { FIL_PROOFS_USE_GPU_COLUMN_BUILDER = "1", FIL_PROOFS_USE_GPU_TREE_BUILDER = "1", CUDA_VISIBLE_DEVICES = "0" }
auto_restart = true
[[processors.pc2]]
cgroup.cpuset = "20-23,36-39"
envs = { FIL_PROOFS_USE_GPU_COLUMN_BUILDER = "1", FIL_PROOFS_USE_GPU_TREE_BUILDER = "1", CUDA_VISIBLE_DEVICES = "1" }
auto_restart = true
[[processors.c2]]
cgroup.cpuset = "28-35"
envs = { CUDA_VISIBLE_DEVICES = "2,3" }
auto_restart = true
[[processors.tree_d]]
cgroup.cpuset = "40-45"
auto_restart = true
The above is an example processors.{stage_name} configuration for a 48C + 4GPU box. Under this configuration, it will start:
- 2 pc1 external processors, both using MULTICORE_SDR mode, each with 8 cores, allowing 2 concurrent tasks, and memory allocation preferentially uses the configured numa partition
- 2 pc2 external processors, each with 8 cores, each using one GPU
- 1 c2 external processor, allocated 8 cores, using one GPU
- 1 tree_d external processor with 6 cores allocated
# Special configuration example
# 1. Using a user-defined, closed-source, algorithmically optimized c2 external processor
[[processors.c2]]
bin = "/usr/local/bin/damocles-worker-c2-optimized"
cgroup.cpuset = "40-47"
envs = { CUDA_VISIBLE_DEVICES = "2,3" }
# 2. c2 external processor using outsourced mode (remote computation)
[[processors.c2]]
bin = "/usr/local/bin/damocles-worker-c2-outsource"
args = ["--url", "/ip4/apis.filecoin.io/tcp/10086/https", "--timeout", "10s"]
envs = { LICENCE_PATH = "/var/tmp/c2.licence.dev" }
# 3. Use CPU mode to make up for pc2 computing power when GPU is lacking
[[processors.pc2]]
cgroup.cpuset = "8-11,24-27"
envs = { FIL_PROOFS_USE_GPU_COLUMN_BUILDER = "1", FIL_PROOFS_USE_GPU_TREE_BUILDER = "1", CUDA_VISIBLE_DEVICES = "0" }
[[processors.pc2]]
cgroup.cpuset = "20-23,36-45"
# 4. Under optimal planning, the total pc1 concurrency is odd and cannot be divided equally
[processors.limitation.concurrent]
pc1 = 29
pc2 = 2
c2 = 1
[[processors.pc1]]
numa_preferred = 0
cgroup.cpuset = "0-41"
concurrent = 14
envs = { FIL_PROOFS_USE_MULTICORE_SDR = "1" }
[[processors.pc1]]
numa_preferred = 1
cgroup.cpuset = "48-92"
concurrent = 15
envs = { FIL_PROOFS_USE_MULTICORE_SDR = "1" }
# 5. To prioritize numa 0 area for pc1
[processors.limit]
pc1 = 29
pc2 = 2
c2 = 1
[[processors.pc1]]
numa_preferred = 0
cgroup.cpuset = "0-47"
concurrent = 16
envs = { FIL_PROOFS_USE_MULTICORE_SDR = "1" }
[[processors.pc1]]
numa_preferred = 1
cgroup.cpuset = "48-86"
concurrent = 13
envs = { FIL_PROOFS_USE_MULTICORE_SDR = "1" }
# cgroup.cpuset Configuration Notes
- For pc1 external processors with multicore SDR enabled, cgroup.cpuset should use the entirety of the CPU cores under one L3 cache as the base unit of allocation as much as possible. If you need to configure only a portion of the CPU cores under an L3 cache, you must make sure that the number of CPU cores taken from each L3 cache is the same. According to rust-fil-proofs, when multicore sdr is enabled, the number of allocated CPU cores and the number of cores sharing a single DIE must be in an integer multiple relationship. If the external processor uses rust-fil-proofs or a variant based on it, this rule must be followed; otherwise it can be ignored. The default processor of damocles-worker is based on rust-fil-proofs.
- The CPU cores in cgroup.cpuset should not span NUMA nodes as much as possible, because accessing memory across NUMA nodes is slower. (damocles supports loading NUMA-affinity hugepage memory files after v0.5.0; if this feature is enabled, you can allocate cpusets across NUMA nodes without concern.)
- If the configured CPU cores are all on the same NUMA node, processors.{stage_name}.numa_preferred can be set to the corresponding NUMA node id.
Use damocles-worker-util to check CPU information
./damocles-worker-util hwinfo
Output:
CPU topology:
Machine (503.55 GiB)
├── Package (251.57 GiB) (*** *** *** 32-Core Processor)
│ ├── NUMANode (#0 251.57 GiB)
│ ├── L3 (#0 16 MiB)
│ │ └── PU #0 + PU #1 + PU #2 + PU #3
│ ├── L3 (#1 16 MiB)
│ │ └── PU #4 + PU #5 + PU #6 + PU #7
│ ├── L3 (#2 16 MiB)
│ │ └── PU #8 + PU #9 + PU #10 + PU #11
│ ├── L3 (#3 16 MiB)
│ │ └── PU #12 + PU #13 + PU #14 + PU #15
│ ├── L3 (#4 16 MiB)
│ │ └── PU #16 + PU #17 + PU #18 + PU #19
│ ├── L3 (#5 16 MiB)
│ │ └── PU #20 + PU #21 + PU #22 + PU #23
│ ├── L3 (#6 16 MiB)
│ │ └── PU #24 + PU #25 + PU #26 + PU #27
│ └── L3 (#7 16 MiB)
│ └── PU #28 + PU #29 + PU #30 + PU #31
└── Package (251.98 GiB) (*** *** *** 32-Core Processor)
├── NUMANode (#1 251.98 GiB)
├── L3 (#8 16 MiB)
│ └── PU #32 + PU #33 + PU #34 + PU #35
├── L3 (#9 16 MiB)
│ └── PU #36 + PU #37 + PU #38 + PU #39
├── L3 (#10 16 MiB)
│ └── PU #40 + PU #41 + PU #42 + PU #43
├── L3 (#11 16 MiB)
│ └── PU #44 + PU #45 + PU #46 + PU #47
├── L3 (#12 16 MiB)
│ └── PU #48 + PU #49 + PU #50 + PU #51
├── L3 (#13 16 MiB)
│ └── PU #52 + PU #53 + PU #54 + PU #55
├── L3 (#14 16 MiB)
│ └── PU #56 + PU #57 + PU #58 + PU #59
└── L3 (#15 16 MiB)
└── PU #60 + PU #61 + PU #62 + PU #63
...
From the output, we can see that this machine has two NUMA nodes; each NUMA node has 8 L3 caches, and each L3 cache has 4 CPU cores.
- CPU cores on NUMANode #0: 0-31
- CPU cores on NUMANode #1: 32-63
Using this machine as an example, configuration like processors.pc1.cgroup.cpuset = "0-6"
does not satisfy the rust-fil-proofs rule.
- PU#0, PU#1, PU#2, PU#3, these 4 CPU cores belong to L3#0
- PU#4, PU#5, PU#6, these 3 CPU cores belong to L3#1
Two L3 caches are involved (L3#0 and L3#1), but the number of CPU cores configured from each is inconsistent (4 vs. 3), which does not meet the rust-fil-proofs rule, so the processor cannot be started. A correct configuration would be processors.pc1.cgroup.cpuset = "0-7".
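For this example machine, a configuration that satisfies the rule might look like the following sketch (shown for illustration only):
[[processors.pc1]]
# cores 0-7 cover L3#0 and L3#1 completely (4 cores from each L3 cache)
cgroup.cpuset = "0-7"
# all of these cores are on NUMANode #0
numa_preferred = 0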
# A minimal working configuration file example
[sector_manager]
rpc_client.addr = "/ip4/{some_ip}/tcp/1789"
# According to actual resource planning
[[sealing_thread]]
location = "{path to sealing store1}"
[[sealing_thread]]
location = "{path to sealing store2}"
[[sealing_thread]]
location = "{path to sealing store3}"
[[sealing_thread]]
location = "{path to sealing store4}"
[[sealing_thread]]
location = "{path to sealing store5}"
[[sealing_thread]]
location = "{path to sealing store6}"
[[sealing_thread]]
location = "{path to sealing store7}"
[[sealing_thread]]
location = "{path to sealing store8}"
[[attached]]
name = "{remote store name}"
location = "{path to remote store}"
[processors.static_tree_d]
32GiB = "{path to static tree_d for 32GiB}"
64GiB = "{path to static tree_d for 64GiB}"
# According to actual resource planning
[processors.limit]
pc1 = 4
pc2 = 2
c2 = 1
[[processors.pc1]]
numa_preferred = 0
cgroup.cpuset = "0-7"
concurrent = 2
envs = { FIL_PROOFS_USE_MULTICORE_SDR = "1" }
[[processors.pc1]]
numa_preferred = 1
cgroup.cpuset = "12-19"
concurrent = 2
envs = { FIL_PROOFS_USE_MULTICORE_SDR = "1" }
[[processors.pc2]]
cgroup.cpuset = "8-11,24-27"
envs = { FIL_PROOFS_USE_GPU_COLUMN_BUILDER = "1", FIL_PROOFS_USE_GPU_TREE_BUILDER = "1", CUDA_VISIBLE_DEVICES = "0" }
[[processors.pc2]]
cgroup.cpuset = "20-23,36-39"
envs = { FIL_PROOFS_USE_GPU_COLUMN_BUILDER = "1", FIL_PROOFS_USE_GPU_TREE_BUILDER = "1", CUDA_VISIBLE_DEVICES = "1" }
[[processors.c2]]
cgroup.cpuset = "28-35"
envs = { CUDA_VISIBLE_DEVICES = "2,3" }
[[processors.tree_d]]
cgroup.cpuset = "40-45"
After planning according to the actual situation and filling in the corresponding information, the above is a minimal working configuration file that:
- Does cc sectors only
- Has optimized tree_d generation for 32GiB and 64GiB sectors
- Has integrated resource allocation
# References to the Venus community test cases
Reference Example 1. Takeaways: PC1 is precisely limited to cores, and C2 is completed by gpuproxy, which has strong scalability. The disadvantage is that the configuration is complex, and the number of tasks needs to be adjusted according to the actual hardware specs.
Reference Example 2. Takeaways: PC2 and C2 share 1 GPU, which may cause some C2 tasks to be backlogged.
Reference Example 3. Takeaways: 2 sets of PC2 share GPU resources with 2 sets of C2 respectively.
Reference Example 4. Takeaways: Suitable for lower-end machines; creates 96G of swap space on NVMe, but this may cause some tasks to be slower.