Keep all your project's settings in one place. Ensure type safety, thread safety and safe secret handling. Validate values with simple and elegant Pythonic syntax. Automatically load values from config files and environment variables.
- Quick start
- Default values
- Documenting fields
- Secret fields
- Type checking
- Validation of values
- Conflicts between fields
- Sources
- Converting values
- Thread safety
- Callbacks for changes
- Read-only fields
- Serialization
Install it:
pip install skelet
You can also quickly try this package and others without installing them via instld.
Now let's create our first storage class. To do this, we need to inherit from the base class Storage and define fields using Field:
from skelet import Storage, Field, NonNegativeInt
class ManDescription(Storage):
    name: str = Field()
    age: NonNegativeInt = Field(validation={'You must be 18 or older to feel important': lambda x: x >= 18})
You may notice that this is very similar to dataclasses or models from Pydantic. Yes, the API is similar, but it is designed specifically for configuration management.
So, let's create an object of our class and look at it:
description = ManDescription(name='Evgeniy', age=32)
print(description)
#> ManDescription(name='Evgeniy', age=32)
The object we created is not just a container for fields. It can also validate values and check types. Let's try assigning an invalid value:
description.age = -5
#> TypeError: The value -5 (int) of the "age" field does not match the type NonNegativeInt.
description.age = 5
#> ValueError: You must be 18 or older to feel important
description.name = 3.14
#> TypeError: The value 3.14 (float) of the "name" field does not match the type str.That is already useful, but the rest of this guide covers more advanced features.
A default value is used when no other source provides one. It will be used until you override it.
You do not have to define a default value, but in this case you need to pass the value when creating the storage object. If you do set a default value, there are two ways to do this:
- Ordinary.
- Lazy (deferred).
You can already see examples of ordinary default values above. Here's another one:
class UnremarkableSettingsStorage(Storage):
    ordinary_field: str = Field('I am the ordinary default value!')
print(UnremarkableSettingsStorage())
#> UnremarkableSettingsStorage(ordinary_field='I am the ordinary default value!')
You can also pass a factory function via default_factory — it will be called each time a new object is created:
class UnremarkableSettingsStorage(Storage):
    ordinary_field: str = Field(default_factory=lambda: 'I am the lazy default value!')
print(UnremarkableSettingsStorage())
#> UnremarkableSettingsStorage(ordinary_field='I am the lazy default value!')
Use this option when the default value is mutable, such as a list or dict. A new object will be created for this field every time a new storage object is created, so the same mutable object will not be shared between instances.
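The sharing hazard that default_factory avoids is easiest to see with the standard library's dataclasses, which follow the same rule. This sketch uses dataclasses rather than skelet itself:

```python
# skelet is not required here: Python's own dataclasses have the same
# mutable-default hazard, and default_factory is the same cure.
from dataclasses import dataclass, field

@dataclass
class WithFactory:
    items: list = field(default_factory=list)  # a fresh list per instance

a, b = WithFactory(), WithFactory()
a.items.append(1)
print(a.items)  # [1]
print(b.items)  # []  (b is unaffected; each instance got its own list)
```

If `items: list = []` were allowed instead, every instance would share one list, and `a.items.append(1)` would silently change `b.items` too.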
You might be tempted to document a field with a comment:
class TheSecretFormula(Storage):
    the_secret_ingredient: str = Field()  # frogs' paws or something else nasty
    ...
Prefer the doc parameter instead:
class TheSecretFormula(Storage):
    the_secret_ingredient: str = Field(doc="frogs' paws or something else nasty")
    ...
Not only does this make the code self-documenting, but the field description will also appear in exception messages:
formula = TheSecretFormula(the_secret_ingredient=13)
#> TypeError: The value 13 (int) of the "the_secret_ingredient" field (frogs' paws or something else nasty) does not match the type str.
Some field values should not appear in logs or string representations. Secret fields are designed for such cases:
class TopStateSecrets(Storage):
    who_killed_kennedy: str = Field('aliens', validation=lambda x: x != 'russians', secret=True)
    red_buttons_password: str = Field('1234', secret=True)
print(TopStateSecrets())
#> TopStateSecrets(who_killed_kennedy=***, red_buttons_password=***)
If you mark a field with the secret flag, as in this example, its contents will be hidden in string representations and exception messages:
secrets = TopStateSecrets()
secrets.who_killed_kennedy = 'russians'
#> ValueError: The value *** (str) of the "who_killed_kennedy" field does not match the validation.
In all other respects, "secret" fields behave the same as regular ones: you can read values and write new ones.
Type hints are optional. When specified, all values are checked against the hint, and a TypeError is raised on mismatch:
class HumanMeasurements(Storage):
    number_of_legs: int = Field(2)
    number_of_hands: int = Field(2)
measurements = HumanMeasurements()
measurements.number_of_legs = 'two'
#> TypeError: The value 'two' (str) of the "number_of_legs" field does not match the type int.
The library supports only a runtime-checkable subset of typing constructs. Checks are based on isinstance. A few additional annotations are also supported:
- Any — means the same thing as the absence of an annotation.
- Union (in the old style or in the new one, using the | operator) — means logical OR between types.
- Optional (again, both in the old style and in the new one, via |) — means that a value of the specified type is expected, or None.
- list, dict, and tuple — can be specified with the types they contain. By default, the contents of these containers are not checked; the exception is values loaded from external sources, whose contents are checked in full.
The library deliberately does not attempt to implement full runtime type checking. If you need more powerful verification, it's better to rely on static tools like mypy.
The library also supports two additional types that allow you to narrow down the behavior of the basic int type:
- NaturalNumber — as the name implies, only objects of type int greater than zero pass this check.
- NonNegativeInt — the same as NaturalNumber, but 0 is also a valid value.
Please note that these constraints are checked only at runtime.
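For illustration, a check with the same behavior as NonNegativeInt can be written in plain Python using isinstance. This is a sketch, not skelet's actual implementation, and whether skelet excludes bool (a subclass of int) is an assumption made here:

```python
def check_non_negative_int(value):
    """A NonNegativeInt-style runtime check: an int greater than or equal to 0.

    Excluding bool (which subclasses int) is an assumption of this sketch.
    """
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0

print(check_non_negative_int(0))    # True
print(check_non_negative_int(-5))   # False
print(check_non_negative_int('5'))  # False: strings are not coerced
```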
In addition to type checking, you can specify arbitrary validation rules for field values.
The simplest way to validate a specific field is to pass a lambda function that returns a bool value as the validation argument for the field:
class ScaryNumbers(Storage):
    unlucky_number: int = Field(13, validation=lambda x: x in [13, 17, 4, 9, 40], doc='a number that is considered unlucky by a particular people')
    number_of_the_beast: int = Field(666, validation=lambda x: x in [616, 666], doc='different translations of the Bible give different numbers for the beast')
numbers = ScaryNumbers()
This function should return True if the value is valid, and False if it is not. If you try to assign an invalid value to the field, an exception will be raised:
numbers.unlucky_number = 7
#> ValueError: The value 7 (int) of the "unlucky_number" field (a number that is considered unlucky by a particular people) does not match the validation.
numbers.number_of_the_beast = 555
#> ValueError: The value 555 (int) of the "number_of_the_beast" field (different translations of the Bible give different numbers for the beast) does not match the validation.
You can also pass a dictionary as a validation parameter, where the keys are messages that will accompany the raised exceptions, and the values are the same functions that return boolean values:
class Numbers(Storage):
    zero: int = Field(0, validation={'Zero is definitely greater than your value.': lambda x: x > -1, 'Zero is definitely less than your value.': lambda x: x < 1})
    ...
numbers = Numbers()
numbers.zero = 1
#> ValueError: Zero is definitely less than your value.
numbers.zero = -1
#> ValueError: Zero is definitely greater than your value.
ⓘ If the value does not pass validation, not only will an exception be thrown, but the value will also not be saved for that field. This is similar to how constraints work in databases.
ⓘ Validation occurs after type checking, so you can be sure that types match when your validation function is called.
All values are validated, including default values. However, sometimes you may need to disable validation for default values — for example, when using sentinel values like None, MISSING, NaN, or an empty string. In this case, pass False as the validate_default argument:
class PatientsCard(Storage):
    had_rubella: bool | None = Field(
        None,
        validation=lambda x: isinstance(x, bool),
        validate_default=False,  # The default value will not be checked.
        doc='we may not know if a person has had rubella, but if we do, then either yes or no',
    )
    ...
Sometimes, individual field values are acceptable, but certain combinations of them are impossible. For such cases, there is a separate type of value check: conflict checking. To enable it, pass a dictionary as the conflicts parameter, whose keys are the names of other fields, and whose values are functions that return bool, answering the question "is there a conflict with the value of this field?":
class Dossier(Storage):
    name: str = Field()
    is_jew: bool | None = Field(None, doc='Jews do not eat pork')
    eats_pork: bool | None = Field(
        None,
        conflicts={'is_jew': lambda old, new, other_old, other_new: new is True and (other_old is True or other_new is True)},
    )
    ...
When a field value changes, the library checks conflict conditions and raises an exception if a conflict is found:
dossier = Dossier(name='John')
dossier.is_jew = True
dossier.eats_pork = True
#> ValueError: The new True (bool) value of the "eats_pork" field conflicts with the True (bool) value of the "is_jew" field (Jews do not eat pork).
ⓘ Conflict checking only happens after type and individual value checking. This means that only values that are guaranteed to be individually valid will be passed to your conflict checking function.
ⓘ More details on this will be provided in the section on thread safety, but here it is useful to know that mutexes for fields with specified conflict conditions are combined. This means that checking fields for conflicts is thread-safe.
The function that checks for a conflict with the value of another field takes 4 positional arguments:
- The old value of the current field.
- The new value of the current field.
- The old value of the field with which a conflict is possible.
- The new value of the field with which a conflict is possible.
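Since a conflict predicate is just a four-argument function returning a bool, it can be reasoned about standalone. Here is the predicate from the Dossier example in isolation, called with hypothetical values:

```python
# The predicate from the Dossier example above.
# Arguments: old and new values of the current field, then old and new
# values of the other (potentially conflicting) field.
def pork_conflict(old, new, other_old, other_new):
    return new is True and (other_old is True or other_new is True)

print(pork_conflict(None, True, None, True))   # True: both fields end up True
print(pork_conflict(None, True, None, False))  # False: no conflict
```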
But why can there be two values for the other field? By default, conflict conditions are checked when values are changed not only for the field for which they are set, but also for potentially conflicting fields:
dossier.eats_pork = True
dossier.is_jew = True
#> ValueError: The new True (bool) value of the "is_jew" field (Jews do not eat pork) conflicts with the True (bool) value of the "eats_pork" field.
Reverse checks can be disabled by passing False as the reverse_conflicts parameter:
...
    eats_pork: bool | None = Field(
        None,
        conflicts={'is_jew': lambda old, new, other_old, other_new: new is True and (other_old is True or other_new is True)},
        reverse_conflicts=False,  # Conflicts will now only be checked when the values of this field change, but not when other fields change.
    )
    ...
However, I do not recommend disabling reverse checks — they ensure that the contents of the fields are consistent with each other.
So far, we have discussed that fields can have default values, as well as values obtained while the program is running. However, there is a third type of value: values loaded from data sources. The library supports several data sources:
- Configuration files in various formats (TOML, YAML, and JSON).
- Environment variables.
- Command-line arguments.
Each field value is resolved in the following order:
Default values → Class sources → Field sources → Instance sources → Values set at runtime
That is, values obtained from sources have higher priority than default values, but can be overwritten (unless you prohibit it) by other values at runtime.
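This precedence chain can be modeled with the standard library's collections.ChainMap, where the first mapping that contains a key wins. This is an illustration of the lookup order, not skelet's internals, and the field names below are made up:

```python
from collections import ChainMap

# Hypothetical field values at each level; earlier mappings win in ChainMap,
# so the layers are listed from highest priority to lowest.
runtime_values = {'host': 'example.com'}
instance_sources = {}
class_sources = {'timeout': 60}
defaults = {'timeout': 30, 'host': 'localhost', 'debug': False}

resolved = ChainMap(runtime_values, instance_sources, class_sources, defaults)
print(resolved['host'])     # example.com (set at runtime)
print(resolved['timeout'])  # 60 (from class sources, overriding the default)
print(resolved['debug'])    # False (the default; nothing overrode it)
```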
There are three ways to specify a list of sources:
- For the whole class.
- For a specific field.
- For a specific instance.
To specify a list of sources for the entire class, pass it to the class constructor:
from skelet import TOMLSource
class MyClass(Storage, sources=[TOMLSource('pyproject.toml', table='tool.my_tool_name')]):
    ...
Also use the sources parameter to specify a list of sources for a specific field:
class MyClass(Storage):
    some_field = Field('some_value', sources=[TOMLSource('pyproject.toml', table='tool.my_tool_name')])
You can also combine these two options by specifying one list of sources for the class as a whole and another list for a specific field. Keep in mind that in this case, the field's list completely replaces the class-level list for that field. If you want the field to use both its own set of sources and the class's list, specify an ellipsis at the end of the field's list:
class MyClass(Storage, sources=[TOMLSource('pyproject.toml', table='tool.my_tool_name')]):
    some_field = Field('some_value', sources=[TOMLSource('config_for_this_field.toml'), ...])
Finally, you can specify a list of sources for a specific instance by passing it as the _sources argument when creating the object:
instance = MyClass(_sources=[TOMLSource('instance_config.toml')])
Without an ellipsis, instance-level sources completely replace both class-level and field-level sources. If you want instance-level sources to have the highest priority while still falling back to other sources, use an ellipsis:
instance = MyClass(_sources=[TOMLSource('instance_config.toml'), ...])
In this case, instance sources are checked first, and if a value is not found, the lookup falls back to the sources that the field would normally use without _sources. The fallback rules are:
- If a field has no sources parameter → fallback to class-level sources directly.
- If a field has sources without ... → fallback to field-level sources only (class-level sources are not included).
- If a field has sources with ... → fallback to field-level sources, then class-level sources.
⚠️ This means that ... in _sources does not always reach class-level sources. If a field defines its own sources without ..., class-level sources are excluded for that field even when instance-level _sources contains ...:
class MyClass(Storage, sources=[EnvSource()]):
    # This field's sources do not include ..., so EnvSource() is unreachable for it:
    some_field = Field('default', sources=[TOMLSource('field_config.toml')])
Only list and tuple are accepted as the _sources collection type.
All values from sources are loaded when the config object is created. If a configuration file changes after the object is created, only newly created objects will reflect the change. Existing objects will retain the old values.
Each data source behaves like a mapping, and field values are looked up by field name. If no value is found in any of the sources, only then will the default value be used. The order in which the contents of the sources are checked corresponds to the order in which the sources themselves are listed. When multiple levels of sources are combined via ellipsis, instance-level sources have the highest priority, followed by field-level sources, and then class-level sources.
For any field, you can change the key used to search for its value in the sources using the alias parameter:
class MyClass(Storage, sources=[TOMLSource('pyproject.toml', table='tool.my_tool_name')]):
    some_field = Field(alias='another_key')
Values obtained from sources are validated in the same way as all others. However, type checking for collections is stricter here: the contents of lists, dictionaries, and tuples are checked in their entirety.
Read more about the available types of sources below.
Environment variables are a common way to provide application settings. To connect them to your class or class field, use the EnvSource class:
from skelet import EnvSource
class MyClass(Storage, sources=[EnvSource()]):
    some_field = Field('some_value')
By default, environment variables are looked up by the attribute name, ignoring case. If you want the lookup to be case-sensitive, pass True as the case_sensitive parameter:
EnvSource(case_sensitive=True)
⚠️ On Windows, environment variables are case-insensitive, so this setting will not work.
Sometimes you may also want to "namespace" environment variables, i.e., give them an application-specific prefix. For example, you may want the value for the field_name attribute to be looked up under the key prefix_field_name. In this case, set the appropriate prefix:
EnvSource(prefix='prefix_')  # So, for attribute "field_name", the search will be performed by key "prefix_field_name".
Similar to the prefix, you can also specify a postfix — a piece of the key that will be added at the end:
EnvSource(postfix='_postfix')  # For attribute "field_name", the search will be performed by key "field_name_postfix".
ⓘ It is important to understand that EnvSource objects cache all environment variable values. A complete cache of all variables is created when a key is searched for the first time. Currently, there is no option to clear the cache; the object can only be replaced entirely.
Environment variables can be used to store values of only certain data types. Strings are converted to typed values based on type hints for specific fields. Here are the supported options:
- str — any string can be interpreted as a str type. If you used the Any annotation for the field or did not specify annotations at all, the value will also be interpreted as a string.
- int — any integers.
- float — any floating-point numbers, including infinities and NaN.
- bool — the strings "yes", "True", and "true" are interpreted as True, while "no", "False", or "false" are interpreted as False.
- date or datetime — strings representing, respectively, dates or dates + time in ISO 8601 format.
- list — lists in json format are expected.
- tuple — lists in json format are expected.
- dict — dicts in json format are expected.
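A rough stdlib approximation of this string-to-value conversion might look like the sketch below. It is illustrative only: convert_env_value is a hypothetical helper, not part of skelet, and covers a subset of the rules listed above.

```python
import json
from datetime import date

def convert_env_value(raw, hint):
    """Hypothetical sketch: convert an env-var string according to a type hint."""
    if hint is bool:
        if raw in ('yes', 'True', 'true'):
            return True
        if raw in ('no', 'False', 'false'):
            return False
        raise ValueError(f'cannot interpret {raw!r} as bool')
    if hint is int:
        return int(raw)
    if hint is float:
        return float(raw)  # also accepts 'inf' and 'nan'
    if hint is date:
        return date.fromisoformat(raw)  # ISO 8601
    if hint in (list, tuple, dict):
        value = json.loads(raw)  # containers are expected in JSON format
        return tuple(value) if hint is tuple else value
    return raw  # str / Any / no annotation: keep the string as-is

print(convert_env_value('true', bool))        # True
print(convert_env_value('[1, 2]', list))      # [1, 2]
print(convert_env_value('2024-01-31', date))  # 2024-01-31
```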
TOML is currently the preferred format for storing application settings in Python projects.
To read the configuration from a specific file, create a TOMLSource object by passing the file name or a Path-like object to the constructor:
from skelet import TOMLSource
class MyClass(Storage, sources=[TOMLSource('my_config.toml')]):
    ...
The TOML format supports so-called “tables” — sections of the configuration that are converted into nested dictionaries when read. By default, the top-level table is read, but you can also read one of the nested tables. To do this, use the table parameter:
TOMLSource('my_config.toml', table='first_level.second_level')  # You can also pass a list of strings instead of a dot-delimited path.
ⓘ If you are writing your own library and allowing users to configure it via a pyproject.toml file, it is generally recommended to use the table tool.<your library name> for this purpose.
ⓘ All file contents are cached after the first value is read.
JSON can also be connected as a source using the JSONSource class:
from skelet import JSONSource
class MyClass(Storage, sources=[JSONSource('my_config.json')]):
    ...
This works similarly to reading TOML files, except that tables are not supported here.
YAML is a popular format for storing configurations. Use the YAMLSource class:
from skelet import YAMLSource
class MyClass(Storage, sources=[YAMLSource('my_config.yaml')]):
    ...
Everything will also work similarly to reading TOML files, except that tables are not supported here.
skelet can automatically parse command-line arguments. To do this, use the FixedCLISource object, to which you need to pass a list of positional and/or named command-line arguments:
#!/usr/bin/env python3
# Obviously, this is not a completed program, just a fragment of the code in it.
from skelet import FixedCLISource
class MyClass(Storage, sources=[
    FixedCLISource(
        named_arguments=['first_field', 'second_field'],
        positional_arguments=['third_field'],
    ),
]):
    first_field: str = Field('default')
    second_field: str = Field('default')
    third_field: str = Field('default')
Now we can run our script, and the arguments will automatically populate the corresponding fields of our class:
./our_script.py --first-field value "positional argument"
As you can see, named arguments are passed with two leading hyphens (--), and all underscores in the field name are replaced with hyphens. If the field name consists of a single character, only one hyphen is added at the beginning.
You do not need to pass a value for a named boolean argument. Other argument types require a value, and they will be interpreted according to their type hints.
All arguments are optional; if an argument is not present on the command line, the default value is used. Positional arguments are filled in exactly the order in which you listed them, and if some are missing, they are assumed to be the trailing ones. For this reason, I do not recommend defining more than one positional command-line argument.
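The hyphen conventions match those of the standard library's argparse. An approximate argparse equivalent of the FixedCLISource example above could look like this (illustrative only, not how skelet parses arguments internally):

```python
import argparse

parser = argparse.ArgumentParser()
# Named arguments: two leading hyphens, underscores replaced with hyphens.
# argparse maps '--first-field' back to the attribute name 'first_field'.
parser.add_argument('--first-field', default='default')
parser.add_argument('--second-field', default='default')
# One optional positional argument.
parser.add_argument('third_field', nargs='?', default='default')

args = parser.parse_args(['--first-field', 'value', 'positional argument'])
print(args.first_field)   # value
print(args.second_field)  # default
print(args.third_field)   # positional argument
```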
Often, you may want to use multiple settings sources together. For example, you may need to combine settings from environment variables and settings from the pyproject.toml file, with environment variables having higher priority. You can pass multiple sources manually, or use for_tool to configure them automatically:
from skelet import for_tool
class MyClass(Storage, sources=for_tool('my_tool_name')):
    ...
How does it work? This function automatically aggregates a set of sources in the following priority (the higher in the list, the higher the priority):
- Environment variables with the prefix <my_tool_name>_.
- Files <my_tool_name>.toml and .<my_tool_name>.toml.
- Section tool.<my_tool_name> of the pyproject.toml file.
- Files <my_tool_name>.yaml and .<my_tool_name>.yaml.
- Files <my_tool_name>.json and .<my_tool_name>.json.
If any of these files do not exist, they will simply be ignored.
Sometimes you need to transform values before storing them. In this case, pass the converter function as the conversion argument:
class Digits(Storage):
    my_favorite_digit: int | str = Field(
        0,
        conversion=lambda x: {
            'zero': 0,
            'one': 1,
            'two': 2,
            'three': 3,
            'four': 4,
            'five': 5,
            'six': 6,
            'seven': 7,
            'eight': 8,
            'nine': 9,
        }.get(x, x),
        validation=lambda x: x is not None and x >= 0 and x < 10,
        doc='my favorite number from 0 to 9',
    )
digits = Digits()
digits.my_favorite_digit = 'two'
print(digits.my_favorite_digit)
#> 2
ⓘ Values are fully validated (type and individual value validation) before and after conversion. If the conversion changes the type of the value, either do not use a type hint at all, or use one that includes both types.
All write operations are protected by mutexes by default, with an individual mutex for each field. The library provides a limited transactional model: if a value fails type checking or other checks, it is not applied, and other threads cannot read the "incorrect" value in the meantime; the new value becomes visible only once all checks have passed. If you specify conflict conditions between two fields, those fields share a single mutex to ensure there are no races.
According to Amdahl's law, the benefits of program parallelization decrease dramatically as the proportion of execution time that occurs under a mutex increases. Therefore, skelet uses a mutex only for the critical operation of replacing one value with another, not for validation.
The thread-safety guarantees are covered by dedicated tests.
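The approach described here (validate without holding the lock, then take the lock only for the value swap) can be sketched with a per-field threading.Lock. This is an illustration of the pattern under that assumption, not skelet's code:

```python
import threading

class LockedField:
    """Sketch: validation runs without the lock; only the swap is locked."""

    def __init__(self, value, validation):
        self._value = value
        self._validation = validation
        self._lock = threading.Lock()

    def set(self, new_value):
        # Validate first, outside the mutex, keeping the critical section short.
        if not self._validation(new_value):
            raise ValueError('the value does not match the validation')
        with self._lock:
            # Readers can never observe a value that failed the checks.
            self._value = new_value

    def get(self):
        with self._lock:
            return self._value

field = LockedField(0, lambda x: isinstance(x, int) and x >= 0)
field.set(5)
print(field.get())  # 5
```

Because validation only inspects the candidate value and never mutates shared state, running it outside the lock is safe, and the stored value changes atomically.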
You can register a callback that runs when a field changes. It fires only when the field is changed directly from program code, not when, for example, the configuration file used as a source is replaced.
ⓘ If you assign a value to the field that is equal to the value that this field had before, the callback will not be called.
To use this, pass the action parameter a function that takes 3 positional arguments:
- Old field value.
- New field value.
- Config object.
ⓘ Be careful when accessing other fields in the config object; avoid causing a deadlock.
Example:
class MyClass(Storage):
    field: int = Field(0, action=lambda old, new, storage: print(f'{old} -> {new}'))
storage = MyClass()
storage.field = 5
#> 0 -> 5
storage.field = 55
#> 5 -> 55
ⓘ The callback will be called only if the new value passes all the checks. Callback execution is protected by the field mutex: two callbacks for the same field of the same object cannot be executed simultaneously. Thus, callback execution is fully thread-safe.
You can make individual fields read-only. To do this, pass read_only=True to the field constructor:
class EternalTruths(Storage):
    inevitability: str = Field('Two things are certain: death and taxes', read_only=True)
storage = EternalTruths()
print(storage.inevitability)
#> Two things are certain: death and taxes
storage.inevitability = 'There are a lot of unavoidable things.'
#> AttributeError: "inevitability" field is read-only.
ⓘ This restriction only applies to user code. Default values and loading values from sources will continue to function.
You can use asdict() to convert a storage object to a standard Python dictionary (dict):
from skelet import asdict
class FlyingConfig(Storage):
    some_field: int = Field(42)
data = asdict(FlyingConfig())
print(data)
#> {'some_field': 42}
After that, you can treat the result as a regular dict, for example, convert it to JSON and send it over the network.
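Since the result is a plain dict, serializing it with the standard json module is straightforward. In this sketch, the dict literal stands in for the asdict() result:

```python
import json

data = {'some_field': 42}  # stands in for the result of asdict(FlyingConfig())
print(json.dumps(data))    # {"some_field": 42}
```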