[WIP] Intrusive shamap inner final #5152

Draft · wants to merge 7 commits into base: develop
1,138 changes: 1,138 additions & 0 deletions Builds/CMake/RippledCore.cmake

Large diffs are not rendered by default.

12 changes: 12 additions & 0 deletions draft_pr_remove_me/README.md
@@ -0,0 +1,12 @@
This directory will be removed when the PR is no longer a draft. It contains some plots comparing the memory usage of this patch with the tip of develop, as well as the scripts used to create those plots. The scripts are tailored to my own setup, but should be straightforward to modify for other people.

The script `memstats2` collects memory statistics on all running rippled processes and outputs a file with a header that includes information about the processes, followed by data with columns for: process id, time (in seconds), size of resident memory (in gigabytes), inner node count, and treenode cache size. The script assumes it is running on a Linux system (it uses the `/proc` filesystem) and that the `jq` program is available (to parse JSON responses from rippled).

The script `memstats.py` is a Python program that creates plots from the data collected by the `memstats2` script.

The remaining files are the plots created with `memstats.py` from an overnight run. Two rippleds were started simultaneously on my development machine with identical config files (other than the location of the database files). The script `memstats2` was then started to collect data. After a 12-hour run, `memstats.py` was run to create the plots. There are three plots:

- `mem_usage_intrusive_ptr.png` - the size of resident memory for the two running rippleds.
- `mem_diff_intrusive_ptr.png` - the difference in resident memory between the two running rippleds (i.e. the memory savings, in gigabytes).
- `mem_percent_change.png` - the percent memory savings, i.e. `100*(old_code_size-new_code_size)/old_code_size`.
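As a quick sanity check of that formula, here is a toy example with made-up resident sizes (the numbers are illustrative, not measured):

```python
# Made-up resident-set sizes in GB (illustrative only, not real measurements).
old_code_size = 12.0  # tip of develop
new_code_size = 9.0   # intrusive-pointer branch

# Percent memory saved, as plotted in mem_percent_change.png.
percent_saved = 100 * (old_code_size - new_code_size) / old_code_size
print(percent_saved)  # 25.0
```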

126 changes: 126 additions & 0 deletions draft_pr_remove_me/memstats.py
@@ -0,0 +1,126 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Mon Oct 23 13:18:42 2023

@author: swd

Analyze the memstats2 output
"""

import pandas as pd
import seaborn as sns
from scipy.signal import medfilt
import matplotlib.pyplot as plt

data_dir = "/home/swd/memstats/data/nov_5"

def raw_file_to_df(data_file_name):
    n_to_skip = 0
    mine_pid = 0
    with open(data_file_name, 'r') as file:
        for line in file:
            if line.startswith('pid'):
                break
            n_to_skip += 1
            if 'projs/ripple/mine' in line:
                mine_pid = int(line.split()[0])
    df = pd.read_csv(data_file_name, header=0,
                     delimiter=r'\s+', skiprows=n_to_skip)
    df['branch'] = 'dev'
    df.loc[df['pid'] == mine_pid, 'branch'] = 'intr_ptr'
    df['uptime_min'] = (df['time'] - df['time'].iloc[0]) / 60.0
    df['uptime_hr'] = df['uptime_min'] / 60

    return df


def get_timescale(df):
    if df['uptime_hr'].iloc[-1] < 5:
        return 'uptime_min', 'min'
    return 'uptime_hr', 'hrs'


def plot_df(df, ignore_min=30):
    x_col, units = get_timescale(df)
    y_col = 'res_gb'

    sns.set(style="whitegrid")
    sns.relplot(kind='line', data=df[df['uptime_min'] > ignore_min],
                x=x_col, y=y_col, hue='branch')

    plt.xlabel(f'Up Time ({units})')
    plt.ylabel('Resident (gb)')
    plt.title('Memory Usage Intrusive Pointer')
    plt.subplots_adjust(top=0.9)
    plt.savefig(f"{data_dir}/mem_usage_intrusive_ptr.png")
    plt.show()


def plot_diff(df, filtered=True, ignore_min=30):
    x_col, units = get_timescale(df)
    diff = pd.DataFrame()
    diff['diff'] = df[df['branch'] == 'dev']['res_gb'].values - \
        df[df['branch'] == 'intr_ptr']['res_gb'].values
    if filtered:
        window_size = 11
        diff['filtered_diff'] = medfilt(diff['diff'], kernel_size=window_size)
    diff['uptime'] = df[df['branch'] == 'dev'][x_col].values
    diff['uptime_min'] = df[df['branch'] == 'dev']['uptime_min'].values
    y_column = 'filtered_diff' if filtered else 'diff'

    sns.set(style="whitegrid")
    sns.relplot(kind='line', data=diff[diff['uptime_min'] > ignore_min],
                x='uptime', y=y_column)

    plt.xlabel(f'Up Time ({units})')
    plt.ylabel('Delta (gb)')
    title = 'Memory Difference Intrusive Pointer'
    if filtered:
        title += ' (filtered)'
    plt.title(title)

    plt.subplots_adjust(top=0.9)
    plt.savefig(f"{data_dir}/mem_diff_intrusive_ptr.png")
    plt.show()


def plot_percent_change(df, filtered=True, ignore_min=30):
    x_col, units = get_timescale(df)
    diff = pd.DataFrame()
    col_name = '% change'
    diff[col_name] = 100 * (df[df['branch'] == 'dev']['res_gb'].values -
                            df[df['branch'] == 'intr_ptr']['res_gb'].values) / \
        df[df['branch'] == 'dev']['res_gb'].values
    if filtered:
        window_size = 11
        diff['filtered_' + col_name] = medfilt(diff[col_name],
                                               kernel_size=window_size)
    diff['uptime'] = df[df['branch'] == 'dev'][x_col].values
    diff['uptime_min'] = df[df['branch'] == 'dev']['uptime_min'].values
    y_column = 'filtered_' + col_name if filtered else col_name

    sns.set(style="whitegrid")
    sns.relplot(kind='line', data=diff[diff['uptime_min'] > ignore_min],
                x='uptime', y=y_column)

    plt.xlabel(f'Up Time ({units})')
    plt.ylabel('% Change (delta/old)')
    title = 'Percent Change Memory Intrusive Pointer'
    if filtered:
        title += ' (filtered)'
    plt.title(title)

    plt.subplots_adjust(top=0.9)
    plt.savefig(f"{data_dir}/mem_percent_change.png")
    plt.show()

def doit():
    data_file_name = f"{data_dir}/data.raw"
    df = raw_file_to_df(data_file_name)
    plot_df(df)
    plot_diff(df, filtered=True)
    plot_percent_change(df, filtered=True)


if __name__ == '__main__':
    doit()
73 changes: 73 additions & 0 deletions draft_pr_remove_me/memstats2
@@ -0,0 +1,73 @@
#!/usr/bin/env zsh

while getopts ":p:o:" opt; do
    case $opt in
        # pids is an array
        p) pids=(${OPTARG})
            ;;
        o) out=${OPTARG}
            ;;
        \?)
            ;;
    esac
done

if [[ -z $pids ]]; then
    # pids is an array
    pids=($(pidof rippled))
fi

if [[ -z $out ]]; then
    echo "Must specify output file"
    exit 1
fi

get_config(){
    # param is the process id
    for i in $(cat /proc/${1}/cmdline | tr '\0' '\n'); do
        if [[ $i == *.cfg ]]; then
            echo $i
            return 0
        fi
    done

    echo "Could not parse config file. Exiting" >&2
    exit 1
}

page_size=$(getconf -a | grep PAGE_SIZE | awk '{print $2}')

echo > ${out}
echo >> ${out}
echo "Page size: " ${page_size} >> ${out}

for pid in $pids[@]; do
    printf "%-7d %s\n" ${pid} $(get_config ${pid}) >> ${out}
    cmdline=$(tr '\0' ' ' < "/proc/${pid}/cmdline")
    printf "%-7d %s\n" ${pid} ${cmdline} >> ${out}
    exe=$(ls -l /proc/${pid}/exe)
    printf "%-7d %s\n\n" ${pid} ${exe} >> ${out}
done

echo "\npid time res_gb inner_node_counts treenode_cache_size" >> ${out}
while true; do
    for pid in $pids[@]; do
        if [[ ! -f /proc/${pid}/statm ]]; then
            exit 1
        fi
        config=$(get_config ${pid})
        # Set the vars in the to_set collection to each line returned by the command in turn.
        to_set=(innerCount cacheSize)
        for i in $(/proc/${pid}/exe --conf ${config} -- get_counts 2>/dev/null \
                       | jq '.result | ."ripple::SHAMapInnerNode",.treenode_cache_size'); do
            eval $to_set[1]=$i
            shift to_set
        done

        pages=$(cat /proc/${pid}/statm | awk '{print $2}')
        gig=1073741824.0
        printf "%-7d %11s %8.3f %-.9d %-.9d\n" ${pid} $(date "+%s") $((pages*page_size/gig)) ${innerCount} ${cacheSize} >> ${out}
        echo $(tail -1 ${out})
    done
    sleep 30
done
12 changes: 12 additions & 0 deletions draft_pr_remove_me/notes.md
@@ -0,0 +1,12 @@
The lock-free part of this code needs to be carefully audited. But note that the memory savings do not depend on the lock-free design: we can re-add the locks and still get the memory savings.

Making the inner node lock-free may actually be _slower_ than using locks, since there are operations that now use atomics that didn't before. This can be addressed by either re-adding the locks, or by modifying this patch so that only code that used to run under locks uses atomic operations, while code that did not run under locks does not.
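To make that tradeoff concrete, here is a schematic sketch (my own illustration, not code from this patch) of the two styles being compared:

```cpp
#include <atomic>
#include <mutex>

// Style 1: a plain field guarded by a mutex (the pre-patch approach).
// Once the lock is held, touching the field is an ordinary cheap increment.
struct LockedNode {
    std::mutex m;
    int childCount = 0;
    void addChild() {
        std::lock_guard<std::mutex> lock(m);
        ++childCount;  // plain increment under the lock
    }
};

// Style 2: lock-free, so every touch is an atomic read-modify-write.
// Even uncontended RMW operations cost more than plain increments, which
// is why the lock-free version can end up slower on paths that previously
// ran without atomics.
struct LockFreeNode {
    std::atomic<int> childCount{0};
    void addChild() {
        childCount.fetch_add(1, std::memory_order_relaxed);
    }
};
```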

A total of 32 bits is used for pointer accounting: 16 bits for the strong count, 14 bits for the weak count, 1 bit indicating a "partial delete" was started, and 1 bit indicating a "partial delete" has finished. I need to audit the code to make sure these limits are reasonable. The total can easily be bumped to 64 bits without _too_ much impact on the memory savings if needed. We can also allocate more bits to the strong count and fewer to the weak count if an audit shows that is better (I suspect there are far fewer weak pointers than strong pointers).
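For illustration, a minimal sketch of how those counts and flags could be packed into a single 32-bit atomic word. The names and layout here are my own for this note, not the patch's actual code:

```cpp
#include <atomic>
#include <cstdint>

// Illustrative bit layout (matching the counts described above):
//   bits  0..15 : strong count (16 bits)
//   bits 16..29 : weak count   (14 bits)
//   bit  30     : "partial delete" started
//   bit  31     : "partial delete" finished
constexpr std::uint32_t kStrongMask   = 0x0000FFFFu;
constexpr std::uint32_t kWeakShift    = 16;
constexpr std::uint32_t kWeakMask     = 0x3FFFu << kWeakShift;
constexpr std::uint32_t kPartialBegun = 1u << 30;
constexpr std::uint32_t kPartialDone  = 1u << 31;

struct RefBits {
    // Start with one strong reference; memory ordering is simplified
    // to relaxed for this sketch -- real code needs a careful audit.
    std::atomic<std::uint32_t> bits{1};

    void addStrong() { bits.fetch_add(1, std::memory_order_relaxed); }
    void addWeak()   { bits.fetch_add(1u << kWeakShift, std::memory_order_relaxed); }
    void markPartialBegun() { bits.fetch_or(kPartialBegun, std::memory_order_relaxed); }

    std::uint32_t strong() const { return bits.load() & kStrongMask; }
    std::uint32_t weak() const   { return (bits.load() & kWeakMask) >> kWeakShift; }
    bool partialBegun() const    { return bits.load() & kPartialBegun; }
};
```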

Much of the complication of the intrusive pointer comes from supporting weak pointers (needed for the tagged cache). If we removed weak pointers from the tagged cache this design gets much simpler.

This code will need substantially more testing (the unit tests are minimal right now). Unit tests that exercise various threading scenarios are particularly important.

Note that this same technique can be used for other objects currently kept in tagged caches. After we do this for inner nodes (this patch) we should see if it makes sense to do it for other objects as well.
