100% Remote Position
Position Type: Contract
Hadoop/Dataproc Platform Analyst
We are seeking a highly skilled Hadoop/Dataproc expert to support a strategic initiative to reverse engineer, analyze, and then retire or migrate our existing Hadoop/Google Cloud Dataproc platform. This role is critical to helping the project team fully understand the current data ecosystem and prepare for a smooth transition.
Key Responsibilities:
Platform Analysis & Assessment: Conduct a comprehensive review of the Hadoop/Dataproc environment, including cluster configurations, resource usage, and job execution patterns. Identify and document all active components, services, and dependencies.
Data & Metadata Extraction: Extract and catalog all data sources, datasets, and associated metadata across the platform. Map data lineage and relationships to support reverse engineering efforts.
Feed & Log Analysis: Analyze incoming and outgoing data feeds, including batch and streaming pipelines. Review system logs, audit trails, and job histories to identify integration points and data flow patterns.
Collaboration & Documentation: Work closely with data engineers, architects, and project leads to provide insights and technical guidance. Document findings in a structured format to support migration planning and platform decommissioning.
Required Skills & Experience: