apavlo edited this page May 30, 2011 · 2 revisions

Authors

Andy Pavlo

Implementation

The MongoDB driver can execute TPC-C in two different data modes: normalized and denormalized. This affects not only how the data is loaded, but also how the transactions execute updates. All fetch commands are written such that the minimum amount of information needed from a document is pulled from the server.
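The minimal-fetch idea can be illustrated with a MongoDB projection, which asks the server to return only the named fields. The following is a sketch rather than the driver's actual code: the helper names `build_projection` and `fetch_customer_balance` are hypothetical, and `customers` is assumed to be a pymongo `Collection` (or any object with a compatible `find_one(filter, projection)` method).

```python
def build_projection(fields):
    """Return a MongoDB projection document that keeps only `fields`."""
    projection = {name: 1 for name in fields}
    # MongoDB returns _id by default, so exclude it explicitly.
    projection["_id"] = 0
    return projection


def fetch_customer_balance(customers, w_id, d_id, c_id):
    """Fetch only C_BALANCE for one customer (hypothetical helper).

    `customers` is assumed to behave like a pymongo Collection.
    """
    query = {"C_W_ID": w_id, "C_D_ID": d_id, "C_ID": c_id}
    return customers.find_one(query, build_projection(["C_BALANCE"]))
```

With a projection like this, the server sends back a document containing just `C_BALANCE` instead of the full customer record.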

Normalized Mode

Each table is stored as a separate collection in the database. All updates occur one at a time for a single record.

Denormalized Mode

When the configuration parameter denormalize is set to true, the MongoDB driver will collapse all of the data belonging to a particular customer record into a single document in the CUSTOMER collection. The structure of this document is as follows:

CUSTOMER = {
    "C_ID":           <int>,
    "C_D_ID":         <int>,
    "C_W_ID":         <int>,
    "C_FIRST":        <str>,
    "C_MIDDLE":       <str>,
    "C_LAST":         <str>,
    "C_STREET_1":     <str>,
    "C_STREET_2":     <str>,
    "C_CITY":         <str>,
    "C_STATE":        <str>,
    "C_ZIP":          <str>,
    "C_PHONE":        <str>,
    "C_SINCE":        <datetime>,
    "C_CREDIT":       <str>,
    "C_CREDIT_LIM":   <float>,
    "C_DISCOUNT":     <float>,
    "C_BALANCE":      <float>,
    "C_YTD_PAYMENT":  <float>,
    "C_PAYMENT_CNT":  <int>,
    "C_DELIVERY_CNT": <int>,
    "C_DATA":         <str>,
    "ORDERS": [
        {
            "O_ID":         <int>,
            "O_ENTRY_D":    <datetime>,
            "O_CARRIER_ID": <int>,
            "O_OL_CNT":     <int>,
            "O_ALL_LOCAL":  <int>,
            "ORDER_LINE": [
                {
                    "OL_NUMBER":      <int>,
                    "OL_I_ID":        <int>,
                    "OL_SUPPLY_W_ID": <int>,
                    "OL_DELIVERY_D":  <datetime>,
                    "OL_QUANTITY":    <int>,
                    "OL_AMOUNT":      <float>,
                    "OL_DIST_INFO":   <str>
                }
            ]
        }
    ],
    "HISTORY": [
        {
            "H_D_ID":   <int>,
            "H_W_ID":   <int>,
            "H_DATE":   <datetime>,
            "H_AMOUNT": <float>,
            "H_DATA":   <str>
        }
    ]
}

Note that the superfluous CUSTOMER information is removed from the embedded ORDERS and HISTORY documents, since the parent CUSTOMER document already has this information. Similarly, the embedded ORDER_LINE documents contain only the information unique to that record.

When using denormalized mode, the tuples for a single DISTRICT are pushed to MongoDB when loadFinishDistrict() is invoked. This is because the driver has to wait to see all of the CUSTOMER-derived data first in order to construct the monolithic document.
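The buffering step described above could look roughly like the sketch below. It is an illustration under assumptions, not the driver's implementation: the function names are hypothetical, rows are assumed to arrive as dictionaries keyed by the standard TPC-C column names, and the function would be called once all rows for a district have been seen.

```python
from collections import defaultdict


def strip_fields(row, redundant):
    """Copy `row` without the columns already implied by its parent."""
    return {k: v for k, v in row.items() if k not in redundant}


def build_customer_documents(customers, orders, order_lines, history):
    """Nest ORDERS, ORDER_LINE, and HISTORY rows under their CUSTOMER row."""
    # Group order lines by the order they belong to.
    lines_by_order = defaultdict(list)
    for ol in order_lines:
        key = (ol["OL_W_ID"], ol["OL_D_ID"], ol["OL_O_ID"])
        lines_by_order[key].append(
            strip_fields(ol, {"OL_W_ID", "OL_D_ID", "OL_O_ID"}))

    # Group orders by customer, embedding each order's lines.
    orders_by_customer = defaultdict(list)
    for o in orders:
        order_doc = strip_fields(o, {"O_W_ID", "O_D_ID", "O_C_ID"})
        order_doc["ORDER_LINE"] = lines_by_order[
            (o["O_W_ID"], o["O_D_ID"], o["O_ID"])]
        orders_by_customer[
            (o["O_W_ID"], o["O_D_ID"], o["O_C_ID"])].append(order_doc)

    # Group history rows by customer.
    history_by_customer = defaultdict(list)
    for h in history:
        key = (h["H_C_W_ID"], h["H_C_D_ID"], h["H_C_ID"])
        history_by_customer[key].append(
            strip_fields(h, {"H_C_W_ID", "H_C_D_ID", "H_C_ID"}))

    # Assemble one monolithic document per customer.
    docs = []
    for c in customers:
        key = (c["C_W_ID"], c["C_D_ID"], c["C_ID"])
        doc = dict(c)
        doc["ORDERS"] = orders_by_customer[key]
        doc["HISTORY"] = history_by_customer[key]
        docs.append(doc)
    return docs
```

The resulting list could then be written to the CUSTOMER collection in one batch, which is why the load has to wait until the whole district is available.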

Driver Dependencies

Known Issues

Because MongoDB does not support multi-command transactions, running with multiple client threads may cause concurrent transactions to interfere with one another. This limitation will cause several of the asserts to fail; the TPC-C framework catches each failure, prints a message, and continues running the client. Use the --debug command-line argument to get more information about these problems.
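As an illustration of one way such a failure could arise, the sketch below simulates two clients performing a non-atomic read-modify-write on a district's next order ID. The column name D_NEXT_O_ID comes from the TPC-C schema; the interleaving, the in-memory dictionary standing in for the database, and the helper name are hypothetical and are not taken from the driver.

```python
def new_order_id(district, value_read_earlier):
    """Assign an order ID from a value read in a separate, earlier command.

    The read and the write are two distinct steps, so another client
    can read the same value in between.
    """
    o_id = value_read_earlier
    district["D_NEXT_O_ID"] = value_read_earlier + 1
    return o_id


# In-memory stand-in for a DISTRICT record.
district = {"D_NEXT_O_ID": 3001}

# Both clients read the counter before either one writes it back.
read_a = district["D_NEXT_O_ID"]
read_b = district["D_NEXT_O_ID"]

o_id_a = new_order_id(district, read_a)
o_id_b = new_order_id(district, read_b)

# Both clients end up with the same order ID, and one increment is lost.
duplicate = o_id_a == o_id_b
```

In a real run, a consistency check on order IDs would be expected to trip in a situation like this one.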

Future Work

Performance could be further improved by moving much of the transaction logic into JavaScript stored procedures. This would reduce the number of round-trips needed between the client code and the MongoDB nodes.