Join two dataframes



    The code counts the number of activity events per user. It calls the readUserData and readUserActivityData functions to load the two DataFrames, which share a userId column, and then joins them so that events can be counted per userName.

    import org.apache.spark.sql.{DataFrame, SparkSession}

    def calculate(sparkSession: SparkSession): Unit = {
      val UserIdColName = "userId"
      val UserNameColName = "userName"
      val CountColName = "totalEventCount"
      val userRdd: DataFrame = readUserData(sparkSession)
      val userActivityRdd: DataFrame = readUserActivityData(sparkSession)
      // DF1 "readUserData": userId, userName
      // DF2 "readUserActivityData": userId, pageId, timestamp, eventType
      // Create the code to join the two dataframes and count the number of events per userName.
      // The output should have the format userName; totalEventCount and include only users that have events.
      // ......
    }
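
One possible way to fill in the join step described in the comments above is sketched below. It is not the snippet author's implementation: the object name UserEventCounts and the function name countEventsPerUserName are hypothetical, and the column names are taken from the comments. It assumes an inner join on userId is what "only for users that have events" calls for, since an inner join drops users without any activity rows.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.count

// Hypothetical helper, named for illustration only.
object UserEventCounts {
  // users:    userId, userName
  // activity: userId, pageId, timestamp, eventType
  def countEventsPerUserName(users: DataFrame, activity: DataFrame): DataFrame =
    users
      // Inner join: users with no matching activity rows are dropped.
      .join(activity, Seq("userId"), "inner")
      // One output row per userName.
      .groupBy("userName")
      // Count the joined rows, i.e. the events for that userName.
      .agg(count("*").as("totalEventCount"))
}
```

Inside calculate, this could be called as UserEventCounts.countEventsPerUserName(userRdd, userActivityRdd). Note that grouping by userName alone merges the counts of distinct users who share a name; if that matters, grouping by both userId and userName and then selecting the two output columns would keep them apart.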