Key data points for analysis

Data Dictionary

The State Drug Utilization Data (SDUD) is a dataset published by CMS (Centers for Medicare & Medicaid Services) that contains prescription drug usage and spending data for Medicaid beneficiaries. Here is a breakdown of the common columns in the SDUD file and what they mean:

Column Name	Description
`State`	The U.S. state or territory that submitted the data. Example: `CA`, `TX`.
`Labeler Code`	The first segment of the NDC (National Drug Code) – identifies the manufacturer.
`Product Code`	The second segment of the NDC – identifies the specific product.
`Package Size`	The third segment of the NDC – identifies package size/type. For example, Drug A could come as 30 pills in one bottle.
`Year`	Calendar year of the data. Example: `2023`.
`Quarter`	Quarter of the year (1 to 4). Example: `1`, `2`.
`Product Name`	Brand or generic name of the drug. Example: `LISINOPRIL`, `AMOXICILLIN`.
`Suppression Used`	Indicates whether data suppression was applied (because of small counts, to protect privacy). `true` or `false`.
`Units Reimbursed (Number)`	Total quantity of units dispensed and reimbursed by Medicaid.
`Number of Prescriptions`	Total number of prescription fills during the period.
`Total Amount Reimbursed`	Total dollars reimbursed for that drug: the sum of the Medicaid and non-Medicaid amounts.
`Medicaid Amount Reimbursed`	Portion of the reimbursement paid by the Medicaid program (state and federal share).
`Non-Medicaid Amount Reimbursed`	Portion of reimbursement not paid by Medicaid, e.g., copays or third parties.
`Utilization_type`	The Medicaid program type under which a prescription was reimbursed: `FFSU` or `MCOU`.
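
The three NDC segments in the table combine into a single drug identifier. Below is a minimal sketch, assuming the common 11-digit 5-4-2 layout (5-digit labeler, 4-digit product, 2-digit package) and assuming the segments may have lost leading zeros if they were parsed as numbers. `build_ndc` is a hypothetical helper, not something provided with the SDUD file, and the sample values are made up.

```python
def build_ndc(labeler_code, product_code, package_size):
    """Join the three NDC segments into one 11-character string (5-4-2).

    Assumption: segments may arrive as ints or unpadded strings, so each
    one is left-padded with zeros to its expected width.
    """
    widths = (5, 4, 2)
    segments = (labeler_code, product_code, package_size)
    parts = []
    for value, width in zip(segments, widths):
        text = str(value).strip().zfill(width)
        if len(text) != width:
            raise ValueError(f"segment {value!r} is longer than {width} characters")
        parts.append(text)
    return "".join(parts)


# Made-up segment values, for illustration only.
ndc = build_ndc(2, "7510", 1)
print(ndc)
```

Keeping the result as a string matters here: storing it as an integer would drop the leading zeros again.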

The dataset contains 14,932 unique medications reimbursed by the Medicaid program.
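
A count like this could be reproduced along the following lines. This is only a sketch under assumptions: the notes do not say whether a "unique medication" means a distinct `Product Name` or a distinct NDC, so the example computes both. The sample rows are made up, and `count_unique_medications` is a hypothetical helper.

```python
# Made-up sample rows; a real run would read them from the SDUD CSV,
# for example with csv.DictReader.
rows = [
    {"Labeler Code": "00002", "Product Code": "7510", "Package Size": "01", "Product Name": "DRUG A"},
    {"Labeler Code": "00002", "Product Code": "7510", "Package Size": "02", "Product Name": "DRUG A"},
    {"Labeler Code": "00093", "Product Code": "1234", "Package Size": "01", "Product Name": "drug b "},
]


def count_unique_medications(rows):
    """Return (distinct product names, distinct NDCs) for the given rows."""
    # Normalize names so casing and stray spaces do not inflate the count.
    names = {row["Product Name"].strip().upper() for row in rows}
    # An NDC is identified by its three segments taken together.
    ndcs = {
        (row["Labeler Code"], row["Product Code"], row["Package Size"])
        for row in rows
    }
    return len(names), len(ndcs)


by_name, by_ndc = count_unique_medications(rows)
print(f"unique product names: {by_name}, unique NDCs: {by_ndc}")
```

The two numbers usually differ, because one product name can appear under several package sizes or labelers.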

FFSU – Fee-For-Service Utilization

This represents traditional Medicaid.
Medicaid directly reimburses providers for each service/drug.
States pay providers per prescription or visit.
Usually used when patients are not enrolled in managed care.

MCOU – Managed Care Organization Utilization

This represents Medicaid Managed Care.
The state pays a capitated payment to a Managed Care Organization (MCO).
The MCO then manages patient care, including prescriptions.
These reimbursements are indirect: the MCOs report the utilization back to the state.
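
To compare the two programs, spending can be summed per utilization type. The following is a minimal sketch with made-up rows; the key names `Utilization_type` and `Total Amount Reimbursed` follow the data dictionary, and `total_by_utilization_type` is a hypothetical helper.

```python
from collections import defaultdict


def total_by_utilization_type(rows):
    """Sum `Total Amount Reimbursed` separately for each utilization type."""
    totals = defaultdict(float)
    for row in rows:
        amount = row["Total Amount Reimbursed"]
        if amount is None:  # rows without an amount cannot be summed
            continue
        totals[row["Utilization_type"]] += float(amount)
    return dict(totals)


# Made-up rows, for illustration only.
rows = [
    {"Utilization_type": "FFSU", "Total Amount Reimbursed": 100.0},
    {"Utilization_type": "MCOU", "Total Amount Reimbursed": 250.5},
    {"Utilization_type": "MCOU", "Total Amount Reimbursed": None},
    {"Utilization_type": "FFSU", "Total Amount Reimbursed": 50.0},
]
print(total_by_utilization_type(rows))
```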

We filter on `suppression_used = FALSE` because rows where `suppression_used` is true have been redacted by the government under HIPAA regulations, so the total amount and units in those rows are null. We therefore exclude them from our analysis.
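
The filter described above might look like the following sketch. It assumes the flag may arrive either as a boolean or as text (for example `true`/`false` or `Y`/`N`, in any casing) and that the keys are spelled `suppression_used`, `units_reimbursed`, and `total_amount_reimbursed`. Those spellings are assumptions about the file at hand, and the sample rows are made up.

```python
def is_suppressed(value):
    """Interpret a suppression flag that may be a bool or a string."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in {"true", "t", "y", "yes", "1"}


def drop_suppressed(rows):
    """Keep only rows whose suppression flag is false."""
    return [row for row in rows if not is_suppressed(row["suppression_used"])]


# Made-up rows: the suppressed one has null units and amount.
rows = [
    {"suppression_used": "false", "units_reimbursed": 120.0, "total_amount_reimbursed": 340.25},
    {"suppression_used": "TRUE", "units_reimbursed": None, "total_amount_reimbursed": None},
    {"suppression_used": False, "units_reimbursed": 30.0, "total_amount_reimbursed": 12.0},
]
kept = drop_suppressed(rows)
print(f"kept {len(kept)} of {len(rows)} rows")
```

Filtering on the flag rather than on null values keeps the intent explicit, and it avoids silently dropping rows that are null for some other reason.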