GenAI Intern

VIDUR

Gurgaon

Not Disclosed

1 Opening(s)

Posted 4 days ago

Application ends Apr 04, 2025

Job Description

Key Responsibilities:
• Develop scripts for scraping websites using Python and related frameworks (Scrapy, BeautifulSoup, Selenium, etc.).
• Extract, clean, and structure data from long raw PDFs into knowledge graphs.
• Work with NLP and text processing libraries (e.g., spaCy, NLTK, PDFMiner, PyMuPDF) to parse legal documents.
• Collaborate with data scientists and engineers to integrate extracted data into larger AI models.
• Optimize scraping techniques to improve speed, accuracy, and reliability.
• Assist in pre-processing and structuring data for machine learning applications.

Requirements:
• Strong proficiency in Python.
• Experience with web scraping (BeautifulSoup, Scrapy, Selenium, or Playwright).
• Familiarity with PDF parsing techniques (PDFMiner, PyMuPDF, or similar).
• Basic understanding of knowledge graphs and relational data representation.
• Knowledge of NLP and text processing is a plus.
• Ability to work on-site in Gurugram for the duration of the internship.
• Strong problem-solving skills and eagerness to learn.

Job Skills

Python

web scraping

Job Overview

Date Posted

February 18, 2025

Location

Gurgaon, Haryana

Offered Salary

Not disclosed

Expiration date

April 04, 2025

Experience

0 To 3 Years

Qualification

Any bachelor's degree
