This repository has been archived by the owner on May 4, 2021. It is now read-only.

Maint 1.1 bug30905 key values count #372

Open
wants to merge 40 commits into
base: master
89efc2b
fix: tests: Test state file consistency
juga0 Mar 14, 2020
fe827ac
fix: state: Read file before setting key
juga0 Mar 14, 2020
c37eda5
fix: state: Let json manage data types
juga0 Mar 14, 2020
d59af7c
chg: state: Add method to count list values
juga0 Mar 14, 2020
ada98e3
chg: timestamps: Add module to manage datetime sequences
juga0 Mar 14, 2020
757ecaf
chg: json: Create custom JSON encoder/decoder
juga0 Mar 14, 2020
43a1d66
chg: state: Encode/decode datetimes
juga0 Mar 21, 2020
5da4d63
chg: resultdump: Use custom json encoder/decoder
juga0 Mar 21, 2020
836af37
chg: v3bwfile: Convert datetime to str
juga0 Mar 22, 2020
a53329a
chg: relaylist, v3bwfile: Count consensus with timestamps
juga0 Mar 21, 2020
8fe7904
chg: relaylist: Count measurements with timestamps
juga0 Mar 21, 2020
80233c9
chg: relayprioritizer: Count priorities with timestamps
juga0 Mar 21, 2020
eb68ddf
chg: resultdump: Remove `_count` from attributes
juga0 Mar 21, 2020
3f9abd5
chg: resultdump: Add missing attrs to errors
juga0 Mar 21, 2020
4b09a23
chg: tests: Remove `_count` from attr
juga0 Mar 21, 2020
6f66909
chg: v3bwfile: Count recent relay's monitoring numbers
juga0 Mar 21, 2020
13c46fa
fix: relaylist: Count recent relay's monitoring numbers
juga0 Mar 21, 2020
4c2e03c
fix: relayprioritizer: Replace call relay priority
juga0 Mar 21, 2020
1bfd036
fix: scanner: Replace call relay measurement attempt
juga0 Mar 21, 2020
3a9d1c4
fix: v3bwfile: Stop calculating failures with 0 attempts
juga0 Mar 22, 2020
6fc4542
fix: tests: Add results incrementing relays'
juga0 Mar 22, 2020
0425b70
fix: tests: Add tests loading results
juga0 Mar 22, 2020
151cfdc
fix: tests: Check the files generated in test net
juga0 Mar 22, 2020
c0811dd
chg: bwfile: Test KeyValues in a bandwidth file
juga0 Mar 22, 2020
1aadfd6
fix: doc: Explain changes in the previous commits
juga0 Mar 23, 2020
5a643ed
fix: test: Assert that caplog messages were found
juga0 Mar 23, 2020
aeec130
fix: test: Check that log prints a number
juga0 Mar 23, 2020
9bd5dfb
fixup! chg: timestamps: Add module to manage datetime sequences
juga0 Apr 6, 2020
c396557
fixup! chg: json: Create custom JSON encoder/decoder
juga0 Apr 6, 2020
e301e8b
fixup! chg: v3bwfile: Convert datetime to str
juga0 Apr 6, 2020
c0756bc
fixup! chg: relaylist, v3bwfile: Count consensus with timestamps
juga0 Apr 6, 2020
bb5560a
fixup! chg: relaylist: Count measurements with timestamps
juga0 Apr 6, 2020
63c7c9c
fixup! chg: relayprioritizer: Count priorities with timestamps
juga0 Apr 7, 2020
f1350dd
fixup! fix: relaylist: Count recent relay's monitoring numbers
juga0 Apr 7, 2020
af5bd7f
fixup! fix: tests: Add results incrementing relays'
juga0 Apr 7, 2020
01ae39a
fixup! fix: tests: Check the files generated in test net
juga0 Apr 7, 2020
22f6847
fixup! chg: bwfile: Test KeyValues in a bandwidth file
juga0 Apr 7, 2020
4927a69
fixup! fix: doc: Explain changes in the previous commits
juga0 Apr 7, 2020
c32c5ff
fixup! chg: relayprioritizer: Count priorities with timestamps
juga0 Apr 9, 2020
1a696a0
fixup! chg: bwfile: Test KeyValues in a bandwidth file
juga0 Apr 9, 2020
92 changes: 90 additions & 2 deletions docs/source/implementation.rst
@@ -36,7 +36,7 @@ A first solution would be to obtain the git revision at runtime, but:
the git revision of that other repository.

So next solution was to obtain the git revision at build/install time.
-To achive this, an script should be call from the installer or at runtime
+To achive this, an script should be called from the installer or at runtime
whenever `__version__` needs to be read.

While it could be implemented by us, there're two external tools that achive
@@ -95,4 +95,92 @@ git or python versions or we find a way to make `setuptools_scm` to detect
the same version at buildtime and runtime.

See `<https://github.com/MartinThoma/MartinThoma.github.io/blob/1235fcdecda4d71b42fc07bfe7db327a27e7bcde/content/2018-11-13-python-package-versions.md>`_
for other comparative versioning python packages.
for other comparative versioning python packages.


Changing Bandwidth file monitoring KeyValues
--------------------------------------------

In version 1.1.0 we added KeyValues called ``recent_X_count`` and
``relay_X_count``, which required modifying several parts of the code.

We only stored numbers for simplicity, but the values of these numbers
accumulate over time and there is no way to know by how much to decrease them,
since some of the main objects are not recreated at runtime and do not have
attributes recording when they were created or updated.
The relations between the objects do not follow the usual one-to-many or
many-to-many relationships either, so the numbers cannot be derived from the
related objects.

The only way we could think of to solve this is to store lists of timestamps,
instead of just numbers, as an attribute in the objects that need to keep a
count.
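The idea of counting with timestamps can be sketched as follows. This is only an illustration under assumptions: the class name ``RecentTimestamps``, its methods and the 5-day default period are hypothetical and are not the API of the ``timestamps`` module added in this pull request.

```python
from datetime import datetime, timedelta


class RecentTimestamps:
    """Hypothetical helper: stores timestamps and counts the recent ones."""

    def __init__(self, period=timedelta(days=5)):
        # Assumed period; entries older than this no longer count.
        self.period = period
        self.timestamps = []

    def add(self, when):
        # Record one event (a consensus, an attempt, a priority list...).
        self.timestamps.append(when)

    def count(self, now):
        # Drop the entries older than the period, then count what is left.
        oldest = now - self.period
        self.timestamps = [t for t in self.timestamps if t >= oldest]
        return len(self.timestamps)


now = datetime(2020, 3, 21, 12, 0, 0)
recent = RecentTimestamps()
recent.add(now - timedelta(days=6))  # older than the period, gets dropped
recent.add(now - timedelta(days=1))
recent.add(now)
print(recent.count(now))  # 2
```

Unlike a plain integer, the list carries enough information to decrease the count as entries age, at the cost of storing and filtering one timestamp per event.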

Where do the values of the keys come from?
``````````````````````````````````````````

In the file system, there are only two types of files where these values can be
stored:

- the results files in ``datadir``
- the ``state.dat`` file

Because of the structure of the content in the results files, they can store
KeyValues for the relays, but not for the headers, which need to be stored in
the ``state.dat`` file.

The classes that manage these KeyValues are:

``RelayList``:

- recent_consensus_count
- recent_measurement_attempt_count

``RelayPrioritizer``:

- recent_priority_list_count
- recent_priority_relay_count

``Relay`` and ``Result``:

- relay_in_recent_consensus_count
- relay_recent_measurement_attempt_count
- relay_recent_priority_list_count

Transition from numbers to datetimes
````````````````````````````````````

The KeyValues named ``_count`` in the results and the state will be ignored
when sbws is restarted with this change, since they will be written without
the ``_count`` suffix in the JSON of these files.
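Because the timestamps end up in JSON files, datetimes need an explicit encoding. The following is a minimal sketch, assuming ISO-like strings and an illustrative key ``recent_consensus``; the names ``DatetimeEncoder``, ``load_state`` and ``DATETIME_FORMAT`` are hypothetical and do not reproduce the encoder/decoder added in this pull request. It also assumes that every list in the state holds datetime strings.

```python
import json
from datetime import datetime

DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S"  # assumed serialization format


class DatetimeEncoder(json.JSONEncoder):
    """Hypothetical encoder: writes datetimes as formatted strings."""

    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.strftime(DATETIME_FORMAT)
        return super().default(obj)


def load_state(text):
    """Decode a JSON state, skipping the legacy ``*_count`` keys."""
    state = {}
    for key, value in json.loads(text).items():
        if key.endswith("_count"):
            continue  # legacy integer counter, no longer read
        if isinstance(value, list):
            # Assumption: lists in the state only contain datetime strings.
            value = [datetime.strptime(v, DATETIME_FORMAT) for v in value]
        state[key] = value
    return state


# A state file holding both an old integer counter and the new timestamps.
text = json.dumps(
    {
        "recent_consensus_count": 3,
        "recent_consensus": [datetime(2020, 3, 21, 10, 0, 0)],
    },
    cls=DatetimeEncoder,
)
state = load_state(text)
print(sorted(state))  # ['recent_consensus']
```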

We could add code to convert these counts during the transition to this
version, but the numbers are wrong anyway and we don't think it's worth the
effort, since they will be correct after 5 days and they have been wrong for a
long time.

Additionally, ``recent_measurement_failure_count`` will be negative, since it's
calculated as ``recent_measurement_attempt_count`` minus all the results.
While the total number of results in the last 5 days is correct, the number of
attempts won't be until 5 days have passed.
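A small worked example of why the failure count goes negative right after the upgrade. The numbers are made up for illustration; only the subtraction comes from the description above.

```python
# Hypothetical values shortly after restarting sbws with this change.
results_last_5_days = 1000   # loaded from the results files, already correct
attempts_since_restart = 50  # attempt timestamps only exist since the restart

# Failures are computed as attempts minus results.
recent_measurement_failure_count = attempts_since_restart - results_last_5_days
print(recent_measurement_failure_count)  # -950
```

Once 5 days have passed, the attempts cover the same period as the results and the difference stops being negative.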

Disadvantages
`````````````

``sbws generate``, with 27795 measurement attempts, takes 1 min instead of a
few seconds.
The same happens with ``RelayPrioritizer.best_priority``, though so far
that seems ok since it's a Python generator in a thread and the measurements
start before it has calculated all the priorities.
The same happens with ``ResultDump``, which reads/writes the data in a thread.

Conclusion
``````````

All these changes required a lot of effort and are not optimal. It was the only
way we found to correct and maintain the 1.1.0 version.
If a 2.0 version happens, we highly recommend redesigning the data structures
to use a database through a well-maintained ORM library, which would avoid the
limitations of JSON files and the errors in data type conversions, and which
is optimized for the kind of counting and statistics we aim for.

.. note:: Documentation about a possible version 2.0 and the steps to change
the code from 1.X needs to be created.
19 changes: 19 additions & 0 deletions sbws/core/bwfile_health.py
@@ -0,0 +1,19 @@
#!/usr/bin/env python3
""""""
import argparse

from sbws.lib.bwfile_health import BwFile


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("-f", "--file-path", help="Bandwidth file path.")

    args = parser.parse_args()

    header_health = BwFile.load(args.file_path)
    header_health.report


if __name__ == "__main__":
    main()
4 changes: 2 additions & 2 deletions sbws/core/scanner.py
@@ -515,8 +515,8 @@ def main_loop(args, conf, controller, relay_list, circuit_builder, result_dump,
# Don't start measuring a relay if sbws is stopping.
if settings.end_event.is_set():
break
-relay_list.increment_recent_measurement_attempt_count()
-target.increment_relay_recent_measurement_attempt_count()
+relay_list.increment_recent_measurement_attempt()
+target.increment_relay_recent_measurement_attempt()
num_relays += 1
# callback and callback_err must be non-blocking
callback = result_putter(result_dump)
20 changes: 20 additions & 0 deletions sbws/globals.py
@@ -148,6 +148,26 @@
# destination fail again.
FACTOR_INCREMENT_DESTINATION_RETRY = 2

# Constants to check health KeyValues in the bandwidth file
PERIOD_DAYS = int(MEASUREMENTS_PERIOD / (24 * 60 * 60))
MAX_RECENT_CONSENSUS_COUNT = PERIOD_DAYS * 24 # 120
# XXX: This was only defined in `config.default.ini`, it should be read from
# here.
FRACTION_RELAYS = 0.05
# A priority list currently takes more than 3h, ideally it should only take 1h.
MIN_HOURS_PRIORITY_LIST = 1
# As of 2020, there are fewer than 7000 relays.
MAX_RELAYS = 8000
# 120
MAX_RECENT_PRIORITY_LIST_COUNT = int(
    PERIOD_DAYS * 24 / MIN_HOURS_PRIORITY_LIST
)
MAX_RELAYS_PER_PRIORITY_LIST = int(MAX_RELAYS * FRACTION_RELAYS)  # 400
# 48000
MAX_RECENT_PRIORITY_RELAY_COUNT = (
    MAX_RECENT_PRIORITY_LIST_COUNT * MAX_RELAYS_PER_PRIORITY_LIST
)


def fail_hard(*a, **kw):
''' Log something ... and then exit as fast as possible '''
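The health-check constants added to ``sbws/globals.py`` in the hunk above can be checked with a standalone sketch. ``MEASUREMENTS_PERIOD`` is defined outside this hunk; the value of 5 days in seconds used here is an assumption, based on the 5-day period mentioned in the documentation and the ``# 120`` comment.

```python
# Assumption: MEASUREMENTS_PERIOD is 5 days expressed in seconds.
MEASUREMENTS_PERIOD = 5 * 24 * 60 * 60

PERIOD_DAYS = int(MEASUREMENTS_PERIOD / (24 * 60 * 60))
# One consensus per hour over the period.
MAX_RECENT_CONSENSUS_COUNT = PERIOD_DAYS * 24
FRACTION_RELAYS = 0.05
MIN_HOURS_PRIORITY_LIST = 1
MAX_RELAYS = 8000
MAX_RECENT_PRIORITY_LIST_COUNT = int(
    PERIOD_DAYS * 24 / MIN_HOURS_PRIORITY_LIST
)
MAX_RELAYS_PER_PRIORITY_LIST = int(MAX_RELAYS * FRACTION_RELAYS)
MAX_RECENT_PRIORITY_RELAY_COUNT = (
    MAX_RECENT_PRIORITY_LIST_COUNT * MAX_RELAYS_PER_PRIORITY_LIST
)

print(PERIOD_DAYS)                      # 5
print(MAX_RECENT_CONSENSUS_COUNT)       # 120
print(MAX_RECENT_PRIORITY_LIST_COUNT)   # 120
print(MAX_RELAYS_PER_PRIORITY_LIST)     # 400
print(MAX_RECENT_PRIORITY_RELAY_COUNT)  # 48000
```

Under that assumption, the results match the values noted in the comments of the hunk (120, 400 and 48000).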