Our company:
Techvantage Analytics is a fast-growing AI services and product engineering company specializing in Analytics, Machine Learning, and AI-based solutions. We are seeking an ambitious and energetic Intern Data Scientist who is passionate about exploring the newest advancements in Generative AI and data science. This role provides an excellent opportunity to work on transformative projects, collaborate with talented professionals, and build a strong foundation in AI-powered solutions.
What are we looking for in an ideal candidate?
- Data Exploration and Analysis: Dive deep into raw data to frame insightful questions, identify trends, and deliver actionable solutions.
- Generative AI Development: Design, fine-tune, and deploy Generative AI models like GPT, Stable Diffusion, and DALL·E for innovative applications, including content generation, AI-powered recommendations, and creative problem-solving.
- Large Language Models (LLMs): Build and optimize LLMs, leveraging frameworks like LangChain to enhance AI capabilities for conversational agents and other real-world applications.
- Model Development: Develop predictive models and classifiers using mathematical modeling, machine learning, and statistical techniques.
- Data Preprocessing: Perform data cleansing, transformation, and preparation of structured and unstructured datasets to uncover patterns and enable advanced analytics.
- AI Integration: Work on integrating Generative AI and advanced AI techniques into scalable systems, ensuring impactful solutions for customer challenges.
- Data Visualization: Create interactive dashboards and visualizations using tools like Tableau, Power BI, or Python libraries (e.g., Seaborn, Plotly) to communicate insights effectively.
- Advanced Applications: Apply computer vision techniques such as preprocessing, feature extraction, and pattern recognition, and experiment with NLP and image generation methodologies.
- Collaborative Development: Work closely with software engineers and architects to extract, transform, and standardize data for analytics and machine learning workflows.
- MLOps Practices: Support the creation and maintenance of automation pipelines for model training, deployment, and monitoring using tools like MLflow, Docker, and Kubernetes.
- Innovation and Strategy: Continuously explore advancements in Generative AI and propose innovative strategies to address complex business problems.
Preferred Skills:
What skills do you need?
- B.Tech/MS/M.Tech or PhD in Computer Science, Machine Learning, AI, or a related field.
- Hands-on experience or familiarity with Generative AI models like GPT, DALL·E, and Midjourney, and with tools such as Hugging Face and the OpenAI APIs.
- Exposure to Large Language Models (e.g., BERT, T5) and to frameworks and techniques such as LangChain and retrieval-augmented generation (RAG) pipelines.
- Understanding of supervised, unsupervised, and reinforcement learning, and expertise in algorithms like XGBoost, random forests, and deep learning models.
- Experience with NLP techniques such as tokenization, embeddings, sentiment analysis, and sequence-to-sequence models.
- Proficiency in ETL processes, data pipelines, and experience with Big Data tools like Apache Spark, Kafka, or Hadoop.
- Strong knowledge of Tableau, Power BI, or Python visualization libraries (Seaborn, Matplotlib, Plotly).
- Familiarity with MLOps tools like MLflow, Kubeflow, Docker, Kubernetes, Jenkins, and CI/CD pipelines.
- Experience with cloud environments like AWS (SageMaker), Azure (Machine Learning Studio), or Google Cloud (Vertex AI).
- Solid foundation in linear algebra, calculus, probability, and statistics, with an ability to apply these concepts to AI models.
- Strong skills in SQL for relational databases and familiarity with NoSQL systems (e.g., MongoDB, Neo4j).
- Knowledge of transformer-based models and their application in NLP, vision, and multimodal tasks.
- Experience with computer vision techniques such as image preprocessing, feature extraction, and pattern recognition, using tools like OpenCV or YOLO.
- Ability to review academic papers and implement cutting-edge techniques.
- Strong analytical thinking, problem-solving, teamwork, and communication skills.
- Exposure to Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and similar generative architectures.
- Contributions to open-source projects or personal AI portfolios on platforms like GitHub.
- Awareness of emerging AI ethics and responsible AI practices.
- Knowledge of time-series analysis, anomaly detection, or geospatial data processing.
- Experience with hybrid or federated learning approaches.
Note: Candidates without hands-on experience with, or practical exposure to, the latest AI trends and technologies are encouraged not to apply.