Let us say you have a source table for loans, and you need to load it into a stage table in a Data Warehouse (DWH) matching a given schema.
Source Table
| Loan ID | Borrower Name | Loan Amount | Loan Status | Address | Phone |
|---|
| LOAN001 | Alice Smith | 10,000 | Approved | 123 Main St | 123-456-7890 |
| LOAN002 | Bob Johnson | 15,000 | Pending | 456 Elm St | 987-654-3210 |
| LOAN003 | Carol Lee | 20,000 | Defaulted | 789 Oak St | 456-789-1230 |
Stage Table Schema Requirements
The stage table schema specifies the required columns as Loan ID, Loan Amount, and Loan Status. Any additional columns in the source dataset must be removed.
- SQL Query:
CREATE TABLE stage_table AS SELECT Loan_ID, Loan_Amount, Loan_Status FROM source_table;
- Python code:
filtered_data = source_data[[‘Loan ID’, ‘Loan Amount’, ‘Loan Status’]]
Filtered Dataset
| Loan ID | Loan Amount | Loan Status |
|---|
| LOAN001 | 10,000 | Approved |
| LOAN002 | 15,000 | Pending |
| LOAN003 | 20,000 | Defaulted |