Factorization software using Quadratic Sieve

This general-purpose integer factorization software is released into the public domain. It is provided "as is" without any warranty, express or implied.

Your Pure C99 Implementation

is immediately compatible with Microsoft Windows and Linux.
is thread-safe and simply has no dependencies, not even math.h.
use Pollard's Rho algorithm for factoring 64-bit integers.
comes with a Big Number library for handling large numbers.
includes a Self-Initializing Quadratic Sieve (SIQS).
use AVL trees to assist the Quadratic Sieve in data management.
is capable of batch factor numbers with JSON or CSV output.
has been tested and produces reproducible results.

The Factorization Manager

reads numbers to be factored from either a file or the command line.
performs preliminary checks (e.g., trial division, perfect power) before invoking the algorithms.
verifies the factorization (the product of factors should be equal to the input number).
returns a 0 exit code when all numbers are fully factored, non-zero otherwise.

Using the Software

The factor.exe file is available for Windows users. Otherwise, to get an executable, follow these steps:

Ubuntu: Install a C compiler with the command: sudo apt install build-essential.
Windows: Install MinGW, which will provide a GCC compiler similar to Ubuntu.

Once you have the compiler, use Terminal on Ubuntu or PowerShell on Windows to navigate to the downloaded directory using the cd command:

Compile the program by executing: gcc -Wall -pedantic -O3 main.c -o factor. GCC will create the appropriate executable for your device.

Compilation should take just a few seconds, and then you can use the software:

Run the executable with a command like: ./factor 2840222005804112554242019945562782034485263657432907

The software will quickly display the factors:

Number: 2840222005804112554242019945562782034485263657432907
Factors: 186565459987 (prime), 123384476380949634131^2

Command Line Options

These options are designed to facilitate your interaction with the software:

Name	Short	Goal	Example
`--input-file`	`-i`	Factor the entire file.	`-i input.txt`
`--output-file`	`-o`	Write results to a file.	`-o output.txt`
`--timeout`	`-t`	Interrupt the Quadratic Sieve after a specified duration in seconds.	`-t 1`
`--force`	`-f`	Override the default limits: 8191 digits for a number and 220-bit for the Quadratic Sieve.	`-f`
`--verbose`	`-v`	Show more or fewer details, such as Quadratic Sieve progress, depending on the verbosity level. The default value is 1.	• `-v 0` • `-v 1` • `-v 2` • `-v 3` • `-v 4`
`--output-csv`	`-c`	Format results as CSV.	`-c`
`--output-json`	`-j`	Format results as JSON.	`-j`
`--output-json-compact`	`-J`	Same as `--output-json`, but disable the pretty print.	`-J`
`--output-sql`	`-sql`	Format results as SQL.	`-sql`
`--rand-seed`	`-r`	Specify a custom XorShift random seed (64-bit).	`-r 123`
`--generate`	`-g`	Designed for testing, this option create a `generated.txt` suitable as input file, containing non-trivial sample numbers. Its parameters, all optional, are `-g <bits-min> <bits-max> <count>`.	• `-g` • `-g 150` • `-g 130 150` • `-g 150 150 1000`
`--help`	`-h`	Print the help.	`-h`

Quadratic sieve parameters

You can enforce parameters to the Quadratic Sieve :

Quadratic Sieve option	Typical value range
`--qs-multiplier <value>`	from `1` to `150`
`--qs-base-size <value>`	from `1000` to `50000`
`--qs-large-prime <value>`	from `300000` to `30000000`
`--qs-alloc-mb <value>`	from `1` to a few hundred
`--qs-sieve <value>`	`31744` by default
`--qs-threshold <value>`	from `60` to `110`
`--qs-error-bits <value>`	from `15` to `35`
`--qs-laziness <value>`	from `80` to `120` (in percentage)

Navigate through the source code to see their default value and usage.

Speed of the factorizer

The software is designed to be a general-purpose factorizer for integers, also supporting negative numbers. The algorithm can factor large numbers (e.g., a 1500-bit highly composite number or 10,000!), but performance can vary significantly depending on the structure of the number. For every additional 3 decimal digits (or 10 bits) to the input number, the Quadratic Sieve duration roughly doubles.

Default target performance
- The software can factor 170-bit RSA integers in a second.
- The software can factor 220-bit RSA integers in approximately 30 seconds.
Performance with basic tuning (sometimes without tuning)
- Successfully factors RSA numbers greater than 250 bits.
- Factored a 300-bit RSA number in 2 hours.
- Factored the 321-bit RSA number related to the bank card case.
- Factored RSA-100, a 330-bit number.

Testing the Software

You can use the built-in generator to easily produce various reproducible samples (consistent across all platforms) in a generated.txt file. These examples are for Ubuntu:

Example of a testing command that factor 10 numbers of 150-bit based on the seed 1, taking a few seconds:

./factor --generate 150 150 10 --rand-seed 1
./factor -i generated.txt -c -v 2 && echo "All the numbers have been fully factored."

You can use --rand-seed 2 to get another set of sample numbers.
In Ubuntu, use --rand-seed $(date +%s%N) to systematically get another set of sample numbers.
In Windows the && operator isn't always supported, and the factorizer often has a .exe extension.
When the software returns a 0 exit code, it means that all numbers have been fully factored.

Example of a verbose testing command that factor 1189625923578446633052664796204789022410184682730580575561447, a 200-bit number in 15 seconds :

./factor --generate 200
./factor -i generated.txt -v 4 && echo "The number have been fully factored."

Example of a testing command that factor sample numbers ranging from 120-bit to 170-bit, taking 20 seconds:

./factor --generate 120 170
./factor -i generated.txt -v 2 && echo "All the numbers have been fully factored."

Example of a testing command that shows in a minute that the Quadratic Sieve selects a good multiplier (better than 1 meaning no multiplier), which speeds up the average execution time by a factor 2:

./factor --generate 170 170 10
./factor -v 2 -c -i generated.txt
./factor -v 2 -c -i generated.txt --qs-multiplier 1

Example of the testing command that generated the provided .csv and .json samples of output files. To give you an idea of the environment, it took 11 minutes per file on the machine that tested the software:

./factor --generate
./factor --input-file generated.txt --output-file example-result.csv --output-csv
./factor --input-file generated.txt --output-file example-result.json --output-json

Example of a quick testing command based on the OEIS list of highly composite numbers:

wget -qO- https://oeis.org/A002182/b002182.txt | cut -d' ' -f 2 > oeis-list.txt
./factor -i oeis-list.txt -c && echo "All the numbers have been fully factored."

How Does the Factorization Manager Operate?

it reads options from the command line.
it prepares a session capable of handling the largest number.
it recursively factors each number using the available algorithms.
it reuses the session to factor all numbers.

Before starting, the input integers are parsed; if an error is detected, the factorization will not begin, and you will be notified via the standard error output. The software commit the output after each factorization by clearing buffers.

Memory Management

The manager allocates a few memory (a session) based on the largest number.
The Quadratic Sieve manages its memory independently, based on its input.
Memory used by the Quadratic Sieve is freed as soon as it has factored its input.
The manager reuse its session, the memory is feeed once all numbers are factored.
Valgrind was used to check for memory leaks, no leaks were found.

Memory allocations are relatively small (typically a few dozen megabytes) and are passed to assert. In case of memory allocation errors (e.g., insufficient available memory), the software will exit, and it is recommended to restart your device.

Big Numbers

The cint library (released "as is", into the public domain, without any warranty, express or implied) is provided for handling large integers. It includes a variety of basic and advanced mathematical functions to support your calculations.

AVL Binary Search Trees

The avl functions (released "as is", into the public domain, without any warranty, express or implied) manage thousands of integers in AVL trees, making Quadratic Sieve relations storage and retrieval efficient and straightforward.

Block Lanczos

The Block Lanczos algorithm (released "as is", into the public domain, without any warranty, express or implied) solve a large, sparse system of linear equations, and identify dependencies among the rows of the matrix formed by the relations identified by the Quadratic Sieve.

Files

Your software's source code is organized into the following files:

Filename	Purpose
`64-bits-factorization.c`	Contains functions for factoring classical integers, including Pollard's Rho algorithm and a Miller-Rabin primality checker.
`avl-trees.c`	Contains the AVL Trees functions, originally a separate project from the integer factorization project.
`basic-math.c`	Includes a prime number generator capable of yielding the first 326,983 prime numbers, along with the Kronecker symbol function, Tonelli-Shanks function, and modular inverse function.
`big-num.c`	Contains the `cint` library, also initially a separate project from the integer factorization project.
`block-lanczos.c`	Provides the linear algebra part of the Quadratic Sieve, including a matrix reducer and a manager for this task.
`headers.h`	Contains definitions for types, structures, functions, and `DEBUG` macros used throughout the project.
`main.c`	Contains the program's entry point and transfers control to the factorization manager after about 30 lines.
`manager.c`	Includes I/O functions, help documentation, shortcuts commonly used by the Quadratic Sieve, and the factorization manager source code.
`quadratic-sieve.c`	Contains the Quadratic Sieve implementation, divided into around thirty functions, with a main function that manages the factorization process in about 30 lines. Search for "debug" in this file to find the debugging elements displayed.

The software :

uses a stateful approach and is thread-safe (no global variables are used).
uses reproductible XorShift random values that depend on the --rand-seed (default to 0).
produces reproducible results regardless of the order in which the factorizations are batched.

The file quadratic-sieve.c

The Quadratic Sieve file structure is as follows:

the ~40 lines function that allows to see the algorithm structure
the 2 important loop conditions
the algorithm parameters
the functions approximately in the order they are called

preparation_part_2 .. 3

The input N is duplicated and called "kN" after this function complete :

multiply N by a prime number to reach 120 bits
apply a multiplier to N intended to optimize runtime

The preparation_part_3_default_proposition function provide a multiplier that make the algorithm completes faster on average. A preparation_part_3_alternative_proposition is also provided for the same purpose but is unused.

Variable	Information
N	Prime factors are removed from N, and N is updated until `N = 1`.
kN	The algorithm computes with kN, which remains constant.

See of how multipliers impact the execution of ./factor 10146246660331135529849066552891483667878618078493758287:

Option used	Makes the algorithm factoring	Comment on the multiplier	Took
`--qs-multiplier 7`	`kN` = `7` * `N`	Selected by the algorithm	3.1 s
`--qs-multiplier 1`	`kN` = `1` * `N`	No multiplier	6.3 s
`--qs-multiplier 113`	`kN` = `113` * `N`	Large multipliers are sometimes good	8.2 s
`--qs-multiplier 29`	`kN` = `29` * `N`	One of the worst	19.4 s
`--qs-multiplier 35`	`kN` = `35` * `N`	One of the worst	26.3 s

While the proposed method doesn't always select the absolute best multiplier, it finds a balance that improves average runtime by a factor 2. The best multiplier can be up to 10 times faster than the worst. Can you guess which multiplier achieves this ? For those interested, the Knuth-Schroeppel analysis explores why choosing the right multiplier makes such a significant difference (Learn more).

qs_parametrize, preparation_part_4

Good parameters can improve the speed.

define the algorithm parameters
allocates a block of memory for the quadratic sieve computations
prepare constants, variables, buffers, data arrays ...

There is a struct inside the qs_sheet called mem :

it holds the base entry point of the allocated memory
it holds a now void* pointer which represent the current available memory

qs_sheet holds 3 AVL tree manager :

one to store the regular relations
one to store the relations that wait to be paired (partials)
one to store the known divisors of N

With small precautions you are supposed to be able to store anything in now, then to update now accordingly to what you stored. now is always supposed to contain only zero until its end. This is the main memory management technique used by this software. mem_align aims to provide aligned pointers, wasting a few bits if necessary.

preparation_part_5 .. 6

Fill the manager's base array with prime numbers provided by a constant expression :

verify that kN is a square mod prime, ignore it otherwise
associates the prime with square root of kN mod prime
associates the prime with its size (log2)
computes invariants like D used to generate the A polynomial coefficient

get_started_iteration

can restore the relations previously saved by Lanczos algorithm
can be used to perform analysis, save/restore factorization to file or other periodic actions

Polynomial `AX^2 + 2BX + C` coefficients :

coefficient	is a constant after	is a constant until
A	`iteration_part_1`	`inner_continuation_condition` completion
B	`iteration_part_4`	for loop final expression
C	`iteration_part_6`	for loop final expression

Polynomial and related data are computed until iteration_part_6 complete
The algorithm prepares data that will help generate polynomial values that will be multiples of the factor base
iteration_part_7 and iteration_part_8 are used for sieving

Search sieve for relations

register_relations searches sieve for relations, it reads the sieve. When the function finds interesting to define the variable X , calculates the value of the polynomial in X then divides the value with the factor base, it thus tries to establish relations.

Buffered knowledge is structured by register_regular_relation and register_partial_relation :

informations potentially useful are saved into a struct qs_relation*
register_regular_relation immediately build a matrix of regular full qs_relation* using AVL tree
register_partial_relation combine (single large primes variation) qs_relation* together using AVL tree

The AVL trees are used to identify duplicates and retrieve their associated data.

inner_continuation_condition

Decides if sieving shoud continue or break, the relation counter is usually the condition.

lanczos_block

This algorithm find a subset of all exponent vectors such that the sum of their exponent vectors is the zero vector :

the process need memory, block-lanczos.c have its own array builder
all memory taken in mem.now is zeroed and reusable after calculations
finding the matrix eigenvalues is usually fast, it can still take 400+ iterations with large N
the algorithm is probabilistic with high success rate, generally it is lucky enough
several tries with and without reduce_matrix are performed before giving up

The reduce_matrix function helps lanczos_block and could be called unconditionally.

finalization_part_1 .. 2

The finalization_part_1 function performs the square root + GCD step based on the lanczos_block answer, it leads to a factorization of N.

qs_register_divisor

The qs_register_divisor function :

perform checks (such as primality, perfect power and GCD between known divisors)
register the appropriate divisors (prime factors of N) to the caller's routine

outer_continuation_condition

This function is used to decide if the algorithm should return the control, or return to sieving.

normal case is to return the control, answer was found
unusual case is when N isn't fully factored (maybe parameters was wrong)

So before giving up, the algorithm searches 10%, 25% then 50% more relations.

Large Prime Variation

In short: Given two partial relations with the same large prime, the register_partial_relation function merges them to obtain a full relation.

This step is optional for completing a factorization but can significantly speed up the process. To disable this feature, simply avoid calling the register_partial_relation function.

During the execution of register_relations, after trial division by the factor base primes (cint_remove), if the quotient remains greater than 1 but is less than the large prime bound, the software registers this partial relation with register_partial_relation. This function stores the new qs_relation in a separate location from full relations.

When the large prime collected matches a prime from another partial relation (detected via the AVL tree), the software combines these two relations and then calls the register_regular_relation function, which registers all regular full relations. At this point, the newly created relation is treated like any other full relation.

N bits	Relations added immediately	Relations added by single large primes variation	Relations needs
150	2551	3	2554
170	3168	659	3827
190	3212	1786	4998
210	3409	3012	6421
230	5659	4686	10345
250	10051	9793	19844
270	14026	15484	29510
290	12820	20201	33021

Online Tools

Dario Alpern's Integer Factorization Calculator: This tool processes the factorization directly in your browser.
Number Empire Factoring Calculator: This tool performs factorization on the server.

Thank you

There are many people to thank :

Carl Pomerance
Jason Papadopoulos
William Hart
my professors at University of Franche-Comté ❤️
GitHub and SourceForge users reporting issues

Any issue ?

You can report any issues, and we'll investigate to correct the source files.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Factorization software using Quadratic Sieve

Your Pure C99 Implementation

The Factorization Manager

Using the Software

Command Line Options

Quadratic sieve parameters

Speed of the factorizer

Testing the Software

How Does the Factorization Manager Operate?

Memory Management

Big Numbers

AVL Binary Search Trees

Block Lanczos

Files

The file quadratic-sieve.c

preparation_part_2 .. 3

qs_parametrize, preparation_part_4

preparation_part_5 .. 6

get_started_iteration

Polynomial `AX^2 + 2BX + C` coefficients :

Search sieve for relations

inner_continuation_condition

lanczos_block

finalization_part_1 .. 2

qs_register_divisor

outer_continuation_condition

Large Prime Variation

Online Tools

Thank you

Any issue ?

About

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
old-version		old-version
64-bits-factorization.c		64-bits-factorization.c
CMakeLists.txt		CMakeLists.txt
README.md		README.md
avl-trees.c		avl-trees.c
basic-math.c		basic-math.c
big-num.c		big-num.c
block-lanczos.c		block-lanczos.c
example-result.csv		example-result.csv
example-result.json		example-result.json
factor.exe		factor.exe
headers.h		headers.h
main.c		main.c
manager.c		manager.c
quadratic-sieve.c		quadratic-sieve.c

michel-leonard/C-Quadratic-Sieve

Folders and files

Latest commit

History

Repository files navigation

Factorization software using Quadratic Sieve

Your Pure C99 Implementation

The Factorization Manager

Using the Software

Command Line Options

Quadratic sieve parameters

Speed of the factorizer

Testing the Software

How Does the Factorization Manager Operate?

Memory Management

Big Numbers

AVL Binary Search Trees

Block Lanczos

Files

The file quadratic-sieve.c

preparation_part_2 .. 3

qs_parametrize, preparation_part_4

preparation_part_5 .. 6

get_started_iteration

Polynomial AX^2 + 2BX + C coefficients :

Search sieve for relations

inner_continuation_condition

lanczos_block

finalization_part_1 .. 2

qs_register_divisor

outer_continuation_condition

Large Prime Variation

Online Tools

Thank you

Any issue ?

About

Topics

Resources

Stars

Watchers

Forks

Languages

Polynomial `AX^2 + 2BX + C` coefficients :