pyspark.sql.functions.min_by
- pyspark.sql.functions.min_by(col, ord)
Returns the value of col from the row in which ord takes its minimum value. When used with groupBy(), it returns, for each group, the col value associated with that group's minimum ord value.
New in version 3.3.0.
Changed in version 3.4.0: Supports Spark Connect.
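- Parameters
col : Column or str
The column whose value is returned.
ord : Column or str
The column to be minimized.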
- Returns
Column
Column object that represents the value from col associated with the minimum value from ord.
Examples
Example 1: Using min_by with groupBy:
>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([
...     ("Java", 2012, 20000), ("dotNET", 2012, 5000),
...     ("dotNET", 2013, 48000), ("Java", 2013, 30000)],
...     schema=("course", "year", "earnings"))
>>> df.groupby("course").agg(sf.min_by("year", "earnings")).sort("course").show()
+------+----------------------+
|course|min_by(year, earnings)|
+------+----------------------+
|  Java|                  2012|
|dotNET|                  2012|
+------+----------------------+
Example 2: Using min_by with different data types:
>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([
...     ("Marketing", "Anna", 4), ("IT", "Bob", 2),
...     ("IT", "Charlie", 3), ("Marketing", "David", 1)],
...     schema=("department", "name", "years_in_dept"))
>>> df.groupby("department").agg(
...     sf.min_by("name", "years_in_dept")
... ).sort("department").show()
+----------+---------------------------+
|department|min_by(name, years_in_dept)|
+----------+---------------------------+
|        IT|                        Bob|
| Marketing|                      David|
+----------+---------------------------+
Example 3: Using min_by on another grouped dataset (each ord value here is unique within its group):
>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([
...     ("Consult", "Eva", 6), ("Finance", "Frank", 5),
...     ("Finance", "George", 9), ("Consult", "Henry", 7)],
...     schema=("department", "name", "years_in_dept"))
>>> df.groupby("department").agg(
...     sf.min_by("name", "years_in_dept")
... ).sort("department").show()
+----------+---------------------------+
|department|min_by(name, years_in_dept)|
+----------+---------------------------+
|   Consult|                        Eva|
|   Finance|                      Frank|
+----------+---------------------------+
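As a rough mental model of the grouped behaviour described at the top of this page, the sketch below reproduces Example 1 in plain Python, without Spark. It is an illustration of the semantics only, not how Spark evaluates the aggregate. The helper name min_by_per_group and its parameters are hypothetical and are not part of the PySpark API.

```python
from collections import defaultdict


def min_by_per_group(rows, group_field, value_field, order_field):
    """Hypothetical helper: for each group, return the value_field of the
    row whose order_field is smallest (a plain-Python model of min_by)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(row)
    # Python's min() returns the first minimal row on a tie. Treat that as a
    # property of this sketch only; do not assume Spark picks the same row.
    return {
        group: min(members, key=lambda r: r[order_field])[value_field]
        for group, members in groups.items()
    }


# Same data as Example 1.
rows = [
    {"course": "Java", "year": 2012, "earnings": 20000},
    {"course": "dotNET", "year": 2012, "earnings": 5000},
    {"course": "dotNET", "year": 2013, "earnings": 48000},
    {"course": "Java", "year": 2013, "earnings": 30000},
]

result = min_by_per_group(rows, "course", "year", "earnings")
print(result)  # {'Java': 2012, 'dotNET': 2012}
```

The dictionary keys play the role of the groupBy() column, and each value corresponds to one cell of the min_by(year, earnings) column in the Example 1 output.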