GenAI Intern

VIDUR

Gurgaon
Not Disclosed
1 Opening(s)
Posted 4 days ago
Application ends Apr 04, 2025

Job Description

Key Responsibilities:
• Develop scripts for scraping websites using Python and related frameworks (Scrapy, BeautifulSoup, Selenium, etc.).
• Extract, clean, and structure data from long raw PDFs into knowledge graphs.
• Work with NLP and text processing libraries (e.g., spaCy, NLTK, PDFMiner, PyMuPDF) to parse legal documents.
• Collaborate with data scientists and engineers to integrate extracted data into larger AI models.
• Optimize scraping techniques to improve speed, accuracy, and reliability.
• Assist in pre-processing and structuring data for machine learning applications.

Requirements:
• Strong proficiency in Python.
• Experience with web scraping (BeautifulSoup, Scrapy, Selenium, or Playwright).
• Familiarity with PDF parsing techniques (PDFMiner, PyMuPDF, or similar).
• Basic understanding of knowledge graphs and relational data representation.
• Knowledge of NLP and text processing is a plus.
• Ability to work on-site in Gurugram for the duration of the internship.
• Strong problem-solving skills and eagerness to learn.

Job Skills

Python
web scraping

Job Overview

Date Posted
February 18, 2025
Location
Gurgaon, Haryana
Offered Salary
Not disclosed
Expiration date
April 04, 2025
Experience
0 To 3 Years
Qualification
Any bachelor's degree