Pig is an open-source dataflow system that allows users to analyze large datasets through a high-level language called Pig Latin. It sits on top of Hadoop and compiles Pig Latin queries into MapReduce jobs. Pig provides simple operations for data manipulation like filtering, grouping, joining, and generating new columns. It is commonly used by companies like Yahoo, Twitter, and LinkedIn to process web logs, build user behavior models, and perform other large-scale data analysis tasks.