Thursday, 13 February 2025

CDK for EMR

 

Step 1: Install Required Software

  1. Install Node.js

    • Download and install Node.js.
    • Verify installation:
      node -v
      
  2. Install AWS CDK
    Open PowerShell as Administrator and run:

    npm install -g aws-cdk
    
    • Verify installation:
      cdk --version
      
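  3. Install AWS CLI

    • Download and install the AWS CLI.
    • Verify installation:
      aws --version
    • Configure the credentials and default region that CDK will deploy with:
      aws configure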
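
The version checks in Step 1 can also be scripted. The sketch below parses the string that node -v prints and compares its major version against a minimum. The function names and the minimum of 18 are hypothetical placeholders chosen for illustration; check the CDK documentation for the Node.js versions it actually supports.

```typescript
// Parse the output of `node -v` (for example "v20.11.1") into numeric parts.
function parseNodeVersion(raw: string): [number, number, number] {
  const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(raw.trim());
  if (!match) {
    throw new Error(`Unrecognized version string: ${raw}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// ASSUMPTION: illustrative minimum only, not taken from the CDK documentation.
const ASSUMED_MIN_NODE_MAJOR = 18;

// True when the major version is at least the assumed minimum.
function isNodeNewEnough(raw: string, minMajor: number = ASSUMED_MIN_NODE_MAJOR): boolean {
  const [major] = parseNodeVersion(raw);
  return major >= minMajor;
}

// Sample input standing in for the real `node -v` output.
console.log(isNodeNewEnough("v20.11.1"));
```

Inside a real Node.js script, process.version holds the same string that node -v prints, so it can be passed to isNodeNewEnough directly.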


Step 2: Set Up a CDK Project

  1. Create a new directory and navigate into it

    mkdir my-emr-cdk
    cd my-emr-cdk
    
  2. Initialize a CDK project (TypeScript)

    cdk init app --language=typescript
    
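  3. Confirm dependencies for EMR
    In CDK v2 the EMR and IAM modules ship inside aws-cdk-lib, which cdk init already adds to the project. The separate @aws-cdk/aws-emr and @aws-cdk/aws-iam packages belong to CDK v1 and do not match the imports used in Step 3. To make sure the v2 packages are installed, run:

    npm install aws-cdk-lib constructs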
    

Step 3: Define the EMR Cluster

  • Open lib/my-emr-cdk-stack.ts using a code editor like VS Code or Notepad++.
  • Replace the contents with:
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as emr from 'aws-cdk-lib/aws-emr';
import * as iam from 'aws-cdk-lib/aws-iam';

export class MyEmrCdkStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

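    // Service role assumed by the EMR service itself
    const emrServiceRole = new iam.Role(this, 'EMRServiceRole', {
      assumedBy: new iam.ServicePrincipal('elasticmapreduce.amazonaws.com'),
      managedPolicies: [
        iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AmazonElasticMapReduceRole'),
      ],
    });

    // Role assumed by the cluster's EC2 instances (the "job flow role")
    const emrEc2Role = new iam.Role(this, 'EMREc2Role', {
      assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
      managedPolicies: [
        iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AmazonElasticMapReduceforEC2Role'),
      ],
    });

    // jobFlowRole must reference an EC2 instance profile, not the service role
    const emrInstanceProfile = new iam.CfnInstanceProfile(this, 'EMRInstanceProfile', {
      roles: [emrEc2Role.roleName],
    });

    // Define EMR Cluster: one master node and two core nodes
    const emrCluster = new emr.CfnCluster(this, 'MyEMRCluster', {
      name: 'MyCDKEMRCluster',
      releaseLabel: 'emr-6.9.0',
      applications: [{ name: 'Hadoop' }, { name: 'Spark' }],
      instances: {
        masterInstanceGroup: {
          instanceType: 'm5.xlarge',
          instanceCount: 1,
        },
        coreInstanceGroup: {
          instanceType: 'm5.xlarge',
          instanceCount: 2,
        },
      },
      jobFlowRole: emrInstanceProfile.attrArn,
      serviceRole: emrServiceRole.roleArn,
      visibleToAllUsers: true,
    });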
  }
}
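
The cluster settings in the stack above are hard-coded. One possible way to keep them in a single place is to describe them as a typed object and validate it before handing the values to the stack. This is only a sketch: the interfaces and the buildClusterConfig helper are hypothetical local stand-ins that mirror the shape used above, and they are not part of aws-cdk-lib.

```typescript
// Local stand-in types; they mirror the shape used in the stack,
// but are NOT imported from aws-cdk-lib.
interface InstanceGroup {
  instanceType: string;
  instanceCount: number;
}

interface ClusterSettings {
  name: string;
  releaseLabel: string;
  applications: string[];
  master: InstanceGroup;
  core: InstanceGroup;
}

// Hypothetical helper: validates the settings and returns a plain object
// shaped like the properties passed to emr.CfnCluster in the stack.
function buildClusterConfig(settings: ClusterSettings) {
  // Release labels in this post look like "emr-6.9.0".
  if (!/^emr-\d+\.\d+\.\d+$/.test(settings.releaseLabel)) {
    throw new Error(`Unexpected release label: ${settings.releaseLabel}`);
  }
  for (const group of [settings.master, settings.core]) {
    if (!Number.isInteger(group.instanceCount) || group.instanceCount < 1) {
      throw new Error(`instanceCount must be a positive integer, got ${group.instanceCount}`);
    }
  }
  return {
    name: settings.name,
    releaseLabel: settings.releaseLabel,
    applications: settings.applications.map((name) => ({ name })),
    instances: {
      masterInstanceGroup: { ...settings.master },
      coreInstanceGroup: { ...settings.core },
    },
  };
}

// The same values the stack uses.
const settings: ClusterSettings = {
  name: 'MyCDKEMRCluster',
  releaseLabel: 'emr-6.9.0',
  applications: ['Hadoop', 'Spark'],
  master: { instanceType: 'm5.xlarge', instanceCount: 1 },
  core: { instanceType: 'm5.xlarge', instanceCount: 2 },
};

const config = buildClusterConfig(settings);
console.log(JSON.stringify(config, null, 2));
```

The returned object could be spread into the CfnCluster properties alongside the two role settings, so a typo in the release label or an instance count of zero fails before anything is synthesized.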

Step 4: Deploy the CDK Stack

  1. Bootstrap AWS CDK (needed once per AWS account and region)

    cdk bootstrap
    
  2. Synthesize CloudFormation template

    cdk synth
    
  3. Deploy the EMR cluster

    cdk deploy
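
Between cdk synth and cdk deploy it can be useful to confirm that the synthesized template really contains an EMR cluster. The sketch below scans a parsed CloudFormation template for resources of a given type. The findResourcesByType helper and the inline sample template are hypothetical and exist only for illustration; a real template would be loaded from the cdk.out directory, where the file name cdk.out/MyEmrCdkStack.template.json is an assumption based on the default stack name.

```typescript
// Minimal shape of a CloudFormation template: logical ID -> resource.
interface TemplateResource {
  Type: string;
  Properties?: Record<string, unknown>;
}

interface Template {
  Resources?: Record<string, TemplateResource>;
}

// Return the logical IDs of all resources with the given CloudFormation type.
function findResourcesByType(template: Template, type: string): string[] {
  return Object.entries(template.Resources ?? {})
    .filter(([, resource]) => resource.Type === type)
    .map(([logicalId]) => logicalId);
}

// Inline sample standing in for a synthesized template (illustrative only;
// real logical IDs for roles usually carry a generated suffix).
const sampleTemplate: Template = {
  Resources: {
    EMRClusterRole: { Type: 'AWS::IAM::Role' },
    MyEMRCluster: {
      Type: 'AWS::EMR::Cluster',
      Properties: { ReleaseLabel: 'emr-6.9.0' },
    },
  },
};

const clusters = findResourcesByType(sampleTemplate, 'AWS::EMR::Cluster');
console.log(clusters);
```

To run the same check against real output, the sample object could be replaced with the result of JSON.parse applied to the contents of the synthesized template file.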
    

Your EMR cluster should now be created!
