✨ add support for business cards, delivery notes, indian passport & update resume (#272)
sebastianMindee authored Nov 14, 2024
1 parent 52f0f46 commit ea95dfd
Showing 35 changed files with 1,401 additions and 41 deletions.
16 changes: 16 additions & 0 deletions docs/extras/code_samples/business_card_v1_async.txt
@@ -0,0 +1,16 @@
from mindee import Client, product, AsyncPredictResponse

# Init a new client
mindee_client = Client(api_key="my-api-key")

# Load a file from disk
input_doc = mindee_client.source_from_path("/path/to/the/file.ext")

# Enqueue the file and parse it asynchronously
result: AsyncPredictResponse = mindee_client.enqueue_and_parse(
    product.BusinessCardV1,
    input_doc,
)

# Print a brief summary of the parsed data
print(result.document)
16 changes: 16 additions & 0 deletions docs/extras/code_samples/delivery_notes_v1_async.txt
@@ -0,0 +1,16 @@
from mindee import Client, product, AsyncPredictResponse

# Init a new client
mindee_client = Client(api_key="my-api-key")

# Load a file from disk
input_doc = mindee_client.source_from_path("/path/to/the/file.ext")

# Enqueue the file and parse it asynchronously
result: AsyncPredictResponse = mindee_client.enqueue_and_parse(
    product.DeliveryNoteV1,
    input_doc,
)

# Print a brief summary of the parsed data
print(result.document)
16 changes: 16 additions & 0 deletions docs/extras/code_samples/ind_passport_v1_async.txt
@@ -0,0 +1,16 @@
from mindee import Client, product, AsyncPredictResponse

# Init a new client
mindee_client = Client(api_key="my-api-key")

# Load a file from disk
input_doc = mindee_client.source_from_path("/path/to/the/file.ext")

# Enqueue the file and parse it asynchronously
result: AsyncPredictResponse = mindee_client.enqueue_and_parse(
    product.ind.IndianPassportV1,
    input_doc,
)

# Print a brief summary of the parsed data
print(result.document)
167 changes: 167 additions & 0 deletions docs/extras/guide/business_card_v1.md
@@ -0,0 +1,167 @@
---
title: Business Card OCR Python
category: 622b805aaec68102ea7fcbc2
slug: python-business-card-ocr
parentDoc: 609808f773b0b90051d839de
---
The Python OCR SDK supports the [Business Card API](https://platform.mindee.com/mindee/business_card).

The [sample below](https://github.com/mindee/client-lib-test-data/blob/main/products/business_card/default_sample.jpg) is used to illustrate how to extract data with the OCR SDK.
![Business Card sample](https://github.com/mindee/client-lib-test-data/blob/main/products/business_card/default_sample.jpg?raw=true)

# Quick-Start
```py
from mindee import Client, product, AsyncPredictResponse

# Init a new client
mindee_client = Client(api_key="my-api-key")

# Load a file from disk
input_doc = mindee_client.source_from_path("/path/to/the/file.ext")

# Enqueue the file and parse it asynchronously
result: AsyncPredictResponse = mindee_client.enqueue_and_parse(
    product.BusinessCardV1,
    input_doc,
)

# Print a brief summary of the parsed data
print(result.document)

```

**Output (RST):**
```rst
########
Document
########
:Mindee ID: 6f9a261f-7609-4687-9af0-46a45156566e
:Filename: default_sample.jpg
Inference
#########
:Product: mindee/business_card v1.0
:Rotation applied: Yes
Prediction
==========
:Firstname: Andrew
:Lastname: Morin
:Job Title: Founder & CEO
:Company: RemoteGlobal
:Email: [email protected]
:Phone Number: +14015555555
:Mobile Number: +13015555555
:Fax Number: +14015555556
:Address: 178 Main Avenue, Providence, RI 02111
:Website: www.remoteglobalconsulting.com
:Social Media: https://www.linkedin.com/in/johndoe
               https://twitter.com/johndoe
```

# Field Types
## Standard Fields
These fields are generic and used in several products.

### BaseField
Each prediction object contains a set of fields that inherit from the generic `BaseField` class.
A typical `BaseField` object will have the following attributes:

* **value** (`Union[float, str]`): corresponds to the field value. Can be `None` if no value was extracted.
* **confidence** (`float`): the confidence score of the field prediction.
* **bounding_box** (`[Point, Point, Point, Point]`): the relative coordinates of exactly 4 vertices (points) of a right rectangle containing the field in the document.
* **polygon** (`List[Point]`): the relative coordinates of the vertices (`Point`) of a polygon containing the field in the image.
* **page_id** (`int`): the ID of the page, always `None` when at document-level.
* **reconstructed** (`bool`): indicates whether an object was reconstructed (not extracted as the API gave it).

> **Note:** A `Point` simply refers to a List of two numbers (`[float, float]`).

Aside from the previous attributes, all basic fields have access to a custom `__str__` method that can be used to print their value as a string.
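The `value` and `confidence` attributes described above are often used together to decide whether a prediction should be trusted. The following is a minimal sketch of that idea, not SDK code: `FakeField` is a hypothetical stand-in that models only `value`, `confidence` and `__str__`, and both `trusted_value` and the `0.8` threshold are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeField:
    """Hypothetical stand-in exposing only `value` and `confidence`."""

    value: Optional[str]
    confidence: float

    def __str__(self) -> str:
        # Assumption for this sketch: a missing value prints as an empty string.
        return "" if self.value is None else str(self.value)


def trusted_value(field: FakeField, min_confidence: float = 0.8) -> Optional[str]:
    """Return the value only if it exists and meets the confidence threshold."""
    if field.value is None or field.confidence < min_confidence:
        return None
    return field.value


company = FakeField(value="RemoteGlobal", confidence=0.95)
fax = FakeField(value="+14015555556", confidence=0.42)
missing = FakeField(value=None, confidence=0.0)

# Only the high-confidence, non-empty field survives the filter.
kept = [trusted_value(f) for f in (company, fax, missing)]
print(kept)
```

With a real response, the same check could be applied to any field of `result.document.inference.prediction`, since the guide states that every field exposes `value` and `confidence`.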

### StringField
The text field `StringField` only has one constraint: its **value** is an `Optional[str]`.

# Attributes
The following fields are extracted for Business Card V1:

## Address
**address** ([StringField](#stringfield)): The address of the person.

```py
print(result.document.inference.prediction.address.value)
```

## Company
**company** ([StringField](#stringfield)): The company the person works for.

```py
print(result.document.inference.prediction.company.value)
```

## Email
**email** ([StringField](#stringfield)): The email address of the person.

```py
print(result.document.inference.prediction.email.value)
```

## Fax Number
**fax_number** ([StringField](#stringfield)): The fax number of the person.

```py
print(result.document.inference.prediction.fax_number.value)
```

## Firstname
**firstname** ([StringField](#stringfield)): The given name of the person.

```py
print(result.document.inference.prediction.firstname.value)
```

## Job Title
**job_title** ([StringField](#stringfield)): The job title of the person.

```py
print(result.document.inference.prediction.job_title.value)
```

## Lastname
**lastname** ([StringField](#stringfield)): The last name of the person.

```py
print(result.document.inference.prediction.lastname.value)
```

## Mobile Number
**mobile_number** ([StringField](#stringfield)): The mobile number of the person.

```py
print(result.document.inference.prediction.mobile_number.value)
```

## Phone Number
**phone_number** ([StringField](#stringfield)): The phone number of the person.

```py
print(result.document.inference.prediction.phone_number.value)
```

## Social Media
**social_media** (List[[StringField](#stringfield)]): The social media profiles of the person or company.

```py
for social_media_elem in result.document.inference.prediction.social_media:
    print(social_media_elem.value)
```
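Because `social_media` is a list and each element's `value` may be `None`, post-processing usually needs a guard. The sketch below is one possible way to reduce the profiles to their host names; `profile_hosts` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for `StringField` instances that model only `value`.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List
from urllib.parse import urlparse


def profile_hosts(fields: Iterable[Any]) -> List[str]:
    """Collect the host name of each profile URL, skipping empty values."""
    hosts = []
    for field in fields:
        if field.value is None:
            continue
        host = urlparse(field.value).netloc
        if host:
            hosts.append(host)
    return hosts


# Stand-in data based on the sample output shown earlier in this guide.
social_media = [
    SimpleNamespace(value="https://www.linkedin.com/in/johndoe"),
    SimpleNamespace(value="https://twitter.com/johndoe"),
    SimpleNamespace(value=None),
]

hosts = profile_hosts(social_media)
print(hosts)
```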

## Website
**website** ([StringField](#stringfield)): The website of the person or company.

```py
print(result.document.inference.prediction.website.value)
```
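The attributes listed above can also be gathered into a single dictionary, for example before storing a contact. This is a sketch under stated assumptions: `to_contact`, `SCALAR_FIELDS` and `_field` are hypothetical names, and `fake_prediction` is a stand-in built with `SimpleNamespace` rather than a real SDK prediction. The email is left as `None` to show how a missing value is carried through.

```python
from types import SimpleNamespace
from typing import Any, Dict

# Scalar attribute names as listed in this guide.
SCALAR_FIELDS = (
    "firstname",
    "lastname",
    "job_title",
    "company",
    "email",
    "phone_number",
    "mobile_number",
    "fax_number",
    "address",
    "website",
)


def to_contact(prediction: Any) -> Dict[str, Any]:
    """Flatten a prediction-like object into a plain dictionary."""
    contact: Dict[str, Any] = {
        name: getattr(prediction, name).value for name in SCALAR_FIELDS
    }
    # `social_media` is a list of fields, so it is handled separately.
    contact["social_media"] = [
        elem.value for elem in prediction.social_media if elem.value is not None
    ]
    return contact


def _field(value):
    # Hypothetical stand-in for a StringField: only `value` is modeled.
    return SimpleNamespace(value=value)


fake_prediction = SimpleNamespace(
    firstname=_field("Andrew"),
    lastname=_field("Morin"),
    job_title=_field("Founder & CEO"),
    company=_field("RemoteGlobal"),
    email=_field(None),
    phone_number=_field("+14015555555"),
    mobile_number=_field("+13015555555"),
    fax_number=_field("+14015555556"),
    address=_field("178 Main Avenue, Providence, RI 02111"),
    website=_field("www.remoteglobalconsulting.com"),
    social_media=[
        _field("https://www.linkedin.com/in/johndoe"),
        _field("https://twitter.com/johndoe"),
    ],
)

contact = to_contact(fake_prediction)
print(contact["firstname"], contact["lastname"])
```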

# Questions?
[Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-2d0ds7dtz-DPAF81ZqTy20chsYpQBW5g)
142 changes: 142 additions & 0 deletions docs/extras/guide/delivery_notes_v1.md
@@ -0,0 +1,142 @@
---
title: Delivery note OCR Python
category: 622b805aaec68102ea7fcbc2
slug: python-delivery-note-ocr
parentDoc: 609808f773b0b90051d839de
---
The Python OCR SDK supports the [Delivery note API](https://platform.mindee.com/mindee/delivery_notes).

The [sample below](https://github.com/mindee/client-lib-test-data/blob/main/products/delivery_notes/default_sample.jpg) is used to illustrate how to extract data with the OCR SDK.
![Delivery note sample](https://github.com/mindee/client-lib-test-data/blob/main/products/delivery_notes/default_sample.jpg?raw=true)

# Quick-Start
```py
from mindee import Client, product, AsyncPredictResponse

# Init a new client
mindee_client = Client(api_key="my-api-key")

# Load a file from disk
input_doc = mindee_client.source_from_path("/path/to/the/file.ext")

# Enqueue the file and parse it asynchronously
result: AsyncPredictResponse = mindee_client.enqueue_and_parse(
    product.DeliveryNoteV1,
    input_doc,
)

# Print a brief summary of the parsed data
print(result.document)

```

**Output (RST):**
```rst
########
Document
########
:Mindee ID: d5ead821-edec-4d31-a69a-cf3998d9a506
:Filename: default_sample.jpg
Inference
#########
:Product: mindee/delivery_notes v1.0
:Rotation applied: Yes
Prediction
==========
:Delivery Date: 2019-10-02
:Delivery Number: INT-001
:Supplier Name: John Smith
:Supplier Address: 4490 Oak Drive, Albany, NY 12210
:Customer Name: Jessie M Horne
:Customer Address: 4312 Wood Road, New York, NY 10031
:Total Amount: 204.75
```

# Field Types
## Standard Fields
These fields are generic and used in several products.

### BaseField
Each prediction object contains a set of fields that inherit from the generic `BaseField` class.
A typical `BaseField` object will have the following attributes:

* **value** (`Union[float, str]`): corresponds to the field value. Can be `None` if no value was extracted.
* **confidence** (`float`): the confidence score of the field prediction.
* **bounding_box** (`[Point, Point, Point, Point]`): the relative coordinates of exactly 4 vertices (points) of a right rectangle containing the field in the document.
* **polygon** (`List[Point]`): the relative coordinates of the vertices (`Point`) of a polygon containing the field in the image.
* **page_id** (`int`): the ID of the page, always `None` when at document-level.
* **reconstructed** (`bool`): indicates whether an object was reconstructed (not extracted as the API gave it).

> **Note:** A `Point` simply refers to a List of two numbers (`[float, float]`).

Aside from the previous attributes, all basic fields have access to a custom `__str__` method that can be used to print their value as a string.
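Since `bounding_box` and `polygon` hold relative coordinates, drawing them on an image requires scaling by the image size. The sketch below assumes that "relative" means each coordinate is a fraction between 0 and 1 of the page width or height. `to_pixels` is a hypothetical helper, and the box and image dimensions are made-up values.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]  # a pair of relative coordinates, as in `[float, float]`


def to_pixels(points: Sequence[Point], width: int, height: int) -> List[Tuple[int, int]]:
    """Scale relative (x, y) points to integer pixel coordinates."""
    return [(round(x * width), round(y * height)) for x, y in points]


# Made-up bounding box: 4 relative vertices of a rectangle.
bounding_box = [[0.25, 0.125], [0.75, 0.125], [0.75, 0.25], [0.25, 0.25]]

pixels = to_pixels(bounding_box, width=800, height=1600)
print(pixels)
```

The same helper works for `polygon`, which is also a list of points but may have more than four vertices.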


### AmountField
The amount field `AmountField` only has one constraint: its **value** is an `Optional[float]`.

### DateField
Aside from the basic `BaseField` attributes, the date field `DateField` also implements the following:

* **date_object** (`Date`): an accessible representation of the value as a Python object. Can be `None`.
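To show what a date object adds over the raw string, here is a small sketch that parses a date string by hand and does date arithmetic with it. It assumes the string is in ISO 8601 format, as in the sample output (`2019-10-02`). `parse_iso_date` is a hypothetical helper, not part of the SDK; with a real response, `date_object` would normally make this step unnecessary.

```python
from datetime import date
from typing import Optional


def parse_iso_date(value: Optional[str]) -> Optional[date]:
    """Parse an ISO 8601 date string, returning None when missing or invalid."""
    if value is None:
        return None
    try:
        return date.fromisoformat(value)
    except ValueError:
        return None


delivery = parse_iso_date("2019-10-02")

# Date arithmetic is only possible on the parsed object, not on the string.
days_after_order = (delivery - date(2019, 9, 25)).days
print(delivery, days_after_order)
```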

### StringField
The text field `StringField` only has one constraint: its **value** is an `Optional[str]`.

# Attributes
The following fields are extracted for Delivery note V1:

## Customer Address
**customer_address** ([StringField](#stringfield)): The address of the customer receiving the goods.

```py
print(result.document.inference.prediction.customer_address.value)
```

## Customer Name
**customer_name** ([StringField](#stringfield)): The name of the customer receiving the goods.

```py
print(result.document.inference.prediction.customer_name.value)
```

## Delivery Date
**delivery_date** ([DateField](#datefield)): The date on which the delivery is scheduled to arrive.

```py
print(result.document.inference.prediction.delivery_date.value)
```

## Delivery Number
**delivery_number** ([StringField](#stringfield)): A unique identifier for the delivery note.

```py
print(result.document.inference.prediction.delivery_number.value)
```

## Supplier Address
**supplier_address** ([StringField](#stringfield)): The address of the supplier providing the goods.

```py
print(result.document.inference.prediction.supplier_address.value)
```

## Supplier Name
**supplier_name** ([StringField](#stringfield)): The name of the supplier providing the goods.

```py
print(result.document.inference.prediction.supplier_name.value)
```

## Total Amount
**total_amount** ([AmountField](#amountfield)): The total monetary value of the goods being delivered.

```py
print(result.document.inference.prediction.total_amount.value)
```
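Because an `AmountField` value is an `Optional[float]`, display code should handle the `None` case before formatting. A minimal sketch follows; `format_amount` is a hypothetical helper, and the `"USD"` default is an assumption made for illustration, since this guide does not list a currency field for delivery notes.

```python
from typing import Optional


def format_amount(value: Optional[float], currency: str = "USD") -> str:
    """Format an amount with two decimals, or a placeholder when missing."""
    if value is None:
        return "n/a"
    return f"{value:,.2f} {currency}"


formatted = format_amount(204.75)
print(formatted)
```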

# Questions?
[Join our Slack](https://join.slack.com/t/mindee-community/shared_invite/zt-2d0ds7dtz-DPAF81ZqTy20chsYpQBW5g)
8 changes: 4 additions & 4 deletions docs/extras/guide/financial_document_v1.md
@@ -58,17 +58,17 @@ print(result.document)
 ########
 Document
 ########
-:Mindee ID: 340ee4ae-b4da-41f0-b5ea-81ae29852b57
+:Mindee ID: b26161ce-35d0-4984-b1ff-886645e160e6
 :Filename: default_sample.jpg
 Inference
 #########
-:Product: mindee/financial_document v1.10
+:Product: mindee/financial_document v1.11
 :Rotation applied: Yes
 Prediction
 ==========
-:Locale: en; en; USD;
+:Locale: en-US; en; US; USD;
 :Invoice Number: INT-001
 :Purchase Order Number: 2412/2019
 :Receipt Number:
@@ -120,7 +120,7 @@ Page Predictions
 Page 0
 ------
-:Locale: en; en; USD;
+:Locale: en-US; en; US; USD;
 :Invoice Number: INT-001
 :Purchase Order Number: 2412/2019
 :Receipt Number: