pyspark.sql.functions.regexp_instr
- pyspark.sql.functions.regexp_instr(str, regexp, idx=None)
Searches str for the Java regex regexp and returns the position at which the first matching substring begins. Positions are 1-based; 0 is returned when nothing matches. idx is the regex group index.
New in version 3.5.0.
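- Parameters
- str: Column or column name
the string column to search.
- regexp: Column or column name
the Java regex pattern to search for.
- idx: int, optional
the regex group index. Defaults to None.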
- Returns
Column
the 1-based position of the first substring in str that matches the Java regex, or 0 if there is no match.
Examples
>>> from pyspark.sql.functions import col, lit, regexp_instr
>>> df = spark.createDataFrame([("1a 2b 14m", r"\d+(a|b|m)")], ["str", "regexp"])
>>> df.select(regexp_instr('str', lit(r'\d+(a|b|m)')).alias('d')).collect()
[Row(d=1)]
>>> df.select(regexp_instr('str', lit(r'\d+(a|b|m)'), 1).alias('d')).collect()
[Row(d=1)]
>>> df.select(regexp_instr('str', lit(r'\d+(a|b|m)'), 2).alias('d')).collect()
[Row(d=1)]
>>> df.select(regexp_instr('str', col("regexp")).alias('d')).collect()
[Row(d=1)]
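Every example above matches at the very start of the string, so each returns 1. To make the position semantics easier to see, here is a minimal pure-Python sketch of the two-argument form. The helper name regexp_instr_sketch is hypothetical and not part of PySpark. It uses Python's re module as a stand-in for the Java regex engine, so it only approximates the behavior for patterns the two dialects interpret the same way.

```python
import re


def regexp_instr_sketch(s, pattern):
    """Hypothetical stand-in for regexp_instr(str, regexp), not PySpark code.

    Returns the 1-based start position of the first match of `pattern`
    in `s`, or 0 when there is no match. Uses Python's `re` engine
    instead of Java regex (an assumption made for illustration).
    """
    match = re.search(pattern, s)
    # match.start() is 0-based, so shift by one to get a 1-based position.
    return match.start() + 1 if match else 0


print(regexp_instr_sketch("1a 2b 14m", r"\d+(a|b|m)"))  # match at the start
print(regexp_instr_sketch("xx 2b 14m", r"\d+(a|b|m)"))  # match begins later
print(regexp_instr_sketch("no digits", r"\d+(a|b|m)"))  # no match
```

The three calls print 1, 4 and 0. The first input is the one used in the examples above; the other two are made-up inputs chosen to show a later match and a non-match.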