Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FDA NDC package equivalence (discrete packaging information) #8

Open
Bridg109 opened this issue Oct 26, 2021 · 5 comments
Open

FDA NDC package equivalence (discrete packaging information) #8

Bridg109 opened this issue Oct 26, 2021 · 5 comments
Assignees

Comments

@Bridg109
Copy link
Collaborator

Proposal

Parse the fda_ndc package field into a package equivalency table

Rationale

adds digestible package information for people to use tied to NDCs

@jrlegrand
Copy link
Member

@Bridg109 - did you get very far with this? Is this just regexing the fda package information field to get total count of unit per outer NDC?

@Bridg109
Copy link
Collaborator Author

Bridg109 commented May 22, 2022 via email

@jrlegrand jrlegrand assigned Bridg109 and jrlegrand and unassigned Bridg109 and jrlegrand Aug 25, 2022
@jrlegrand jrlegrand changed the title fda_ndc_package_equivalance FDA NDC package equivalence (discrete packaging information) Jan 23, 2024
@jrlegrand jrlegrand self-assigned this May 2, 2024
@jrlegrand
Copy link
Member

See some analysis and notes on this page: https://coderx.io/sagerx/source-data/dailymed

See branch I associated with this issue (jrlegrand/pack-size).

I've noticed that "parts" (i.e. kit parts) tend to be the final part of a description.

Also - parts can have "sub-parts"(?) like this: * 1 SYRINGE in 1 TRAY / .4 mL in 1 SYRINGE (0025-0329-01)

I'm thinking that we have 3(?) layers - maybe more:

Example:
2 KIT in 1 CARTON (0025-0333-02) / 1 KIT in 1 KIT * 1 SYRINGE in 1 TRAY / .4 mL in 1 SYRINGE (0025-0329-01) * 1 mL in 1 PACKET (0069-1111-01)

  1. packaging components
    1. 2 KIT in 1 CARTON (0025-0333-02)
    2. 1 KIT in 1 KIT * 1 SYRINGE in 1 TRAY / .4 mL in 1 SYRINGE (0025-0329-01) * 1 mL in 1 PACKET (0069-1111-01)
  2. packaging parts (both relate to component ii)
      • 1 SYRINGE in 1 TRAY / .4 mL in 1 SYRINGE (0025-0329-01)
      • 1 mL in 1 PACKET (0069-1111-01)
  3. packaging subparts (relates to part i)
    1. / .4 mL in 1 SYRINGE (0025-0329-01)

Another way of visualizing this:

  • 2 KIT in 1 CARTON (0025-0333-02)
    • / 1 KIT in 1 KIT
      • * 1 SYRINGE in 1 TRAY
        • / .4 mL in 1 SYRINGE (0025-0329-01)
      • * 1 mL in 1 PACKET (0069-1111-01)

@jrlegrand
Copy link
Member

jrlegrand commented Nov 7, 2024

Highest "slash count" of a non-KIT packagedescription is 4

NOTE: these also look like something is wrong with them. Likely someone filled out the form incorrectly. There's only two examples with 4 slashes.

4-slash examples

500 BOTTLE in 1 BOTTLE (71309-003-05) / 1000 BOTTLE in 1 BOTTLE (71309-003-01) / 365 BOTTLE in 1 BOTTLE (71309-003-65) / 30 CARTON in 1 BOTTLE (71309-003-30) / 1 TABLET, DELAYED RELEASE in 1 CARTON

90 BOTTLE in 1 BOTTLE (61919-229-90) / 120 BOTTLE in 1 BOTTLE (61919-229-72) / 60 BOTTLE in 1 BOTTLE (61919-229-60) / 50 BOTTLE in 1 BOTTLE (61919-229-50) / 30 TABLET in 1 BOTTLE (61919-229-30)

When you get to 3, it seems much more normal, and there's only about 80 examples with 3 slashes. The rest of the 220k+ NDCs are 2 or less slashes.

3-slash examples

12 PACKAGE in 1 CARTON (71349-070-12) / 1 APPLICATOR in 1 PACKAGE (71349-070-01) / 1 AMPULE in 1 APPLICATOR / .45 mL in 1 AMPULE

2 BOX in 1 CARTON (73362-102-01) / 1 SYRINGE in 1 BOX (73362-102-02) / 1 CONTAINER in 1 SYRINGE (73362-102-03) / 1.4 mL in 1 CONTAINER

10 POUCH in 1 CASE (51662-1208-3) / 1 CARTON in 1 POUCH (51662-1208-2) / 1 SYRINGE, GLASS in 1 CARTON / 5 mL in 1 SYRINGE, GLASS

2-slash examples

10 BOX in 1 CASE (68599-5804-6) / 100 PACKET in 1 BOX (68599-5804-5) / 1 mL in 1 PACKET (68599-5804-4)

10 CARTON in 1 CARTON (70069-008-10) / 1 BOTTLE in 1 CARTON (70069-008-01) / 5 mL in 1 BOTTLE

3 TRAY in 1 CASE (68094-061-62) / 10 CUP, UNIT-DOSE in 1 TRAY / 5 mL in 1 CUP, UNIT-DOSE (68094-061-59)

56 POUCH in 1 CARTON (68180-962-56) / 4 AMPULE in 1 POUCH / 5 mL in 1 AMPULE

1-slash examples

1 BOTTLE in 1 CARTON (16714-898-01) / 30 TABLET in 1 BOTTLE

1 TUBE in 1 CARTON (24338-185-04) / 117 g in 1 TUBE

1 BOTTLE in 1 CARTON (24385-531-26) / 120 mL in 1 BOTTLE

0-slash examples

30 TABLET in 1 CONTAINER (33342-115-07)

1000 TABLET, FILM COATED in 1 BOTTLE (33342-192-44)

60 mL in 1 BOTTLE, DROPPER (48951-8221-3)

@jrlegrand
Copy link
Member

jrlegrand commented Nov 7, 2024

Proofs

descriptions that have asterisks (parts) ALWAYS have a KIT component
```sql
with 

all_fda as (
	select ndc11, packagedescription
	from sagerx_dev.stg_fda_ndc__ndcs

	union all

	select ndc11, packagedescription
	from sagerx_dev.stg_fda_excluded__ndcs

	union all

	select ndc11, packagedescription
	from sagerx_dev.stg_fda_unfinished__ndcs
)

, fda_analysis as (
	select
		ndc11, 
		length(packagedescription) as str_length
		, packagedescription like('%*%') as asterisk_tf
		, packagedescription like('% KIT %') as kit_tf
		, packagedescription
	from all_fda
)

select count(*) from fda_analysis
where asterisk_tf = true
and kit_tf = false
	
// result: 0
```
descriptions that have a KIT component don’t always have asterisks (parts) ``` with
all_fda as (
	select ndc11, packagedescription
	from sagerx_dev.stg_fda_ndc__ndcs

	union all

	select ndc11, packagedescription
	from sagerx_dev.stg_fda_excluded__ndcs

	union all

	select ndc11, packagedescription
	from sagerx_dev.stg_fda_unfinished__ndcs
)

, fda_analysis as (
	select
		ndc11, 
		length(packagedescription) as str_length
		, packagedescription like('%*%') as asterisk_tf
		, packagedescription like('% KIT %') as kit_tf
		, packagedescription
	from all_fda
)

select count(*) from fda_analysis
where asterisk_tf = false
and kit_tf = true
	
// result: 986
```
there are 2863 KIT-containing descriptions (out of 220k+) and of those, 2114 have an asterisk ``` with

all_fda as (
select ndc11, packagedescription
from sagerx_dev.stg_fda_ndc__ndcs

)

, fda_analysis as (
select
ndc11,
length(packagedescription) as str_length
, case when packagedescription like('%*%') then 1 else 0 end as asterisk_tf
, case when packagedescription like('% KIT %') then 1 else 0 end as kit_tf
, array_length(string_to_array(packagedescription, '/'), 1) - 1 as slash_count
, packagedescription
from all_fda
)

select
sum(asterisk_tf) as total_asterisk,
sum(kit_tf) as total_kit

from fda_analysis

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Todo
Development

When branches are created from issues, their pull requests are automatically linked.

2 participants