Mastering tryCatch() in R: From Basics to Structured Logging

Introduction

Error handling is a fundamental part of writing reliable R code. Whether you’re reading dozens of files, running a loop over thousands of observations, or building a data pipeline, something is bound to go wrong. When it does, tryCatch() allows your code to fail gracefully—without crashing your entire process.

In this post, we’ll start from scratch to understand how tryCatch() works, and build up to a practical and structured approach to logging and recovering from specific types of failures. Our goal is not just to catch errors, but to catch them with context, and handle them in a controlled and traceable way.

library(data.table)

1. A First Look at `tryCatch()`

Let’s begin with a minimal example that handles a simple error:

tryCatch(
  expr = {
    stop("Something went wrong!")
  },
  error = function(e) {
    message("Caught an error: ", e$message)
    return(NA)
  }
)

Caught an error: Something went wrong!

[1] NA

This code tries to run the expression stop("..."), which throws an error. The error = block intercepts that error and prints a message, returning NA instead.

Similarly, we can catch warnings:

tryCatch(
  expr = {
    warning("Something is off...")
    42
  },
  warning = function(w) {
    message("Caught a warning: ", w$message)
    return(-1)
  }
)

Caught a warning: Something is off...

[1] -1

And we can also define a finally block that always runs, even if there’s no error or warning:

tryCatch(
  expr = {
    10
  },
  finally = {
    message("This always runs.")
  }
)

This always runs.

[1] 10

The core idea is this: tryCatch() lets you write error-handling logic in the same place your code runs, using specific condition types (like error, warning, or custom classes).

2. A Practical Example with Recovery Logic

Let’s look at a more realistic scenario. Suppose you have a function that sometimes succeeds, sometimes warns, and sometimes fails, depending on the input.

Note

This example is an adaptation of Jonathanscallahan’s blog Basic Error Handing in R with tryCatch()

my_divide <- function(d, a) {
  if (a == "warning") {
    warning("my_divide warning message")
    return("Warning fallback result")
  } else if (a == "error") {
    stop("my_divide error message")
  } else {
    return(d / as.numeric(a))
  }
}

Now let’s use tryCatch() to run this function and react to each case:

run_example <- function(a) {
  result <- tryCatch({

    b <- 2
    c <- b^2
    d <- c + 2

    if (a == "suppress-warnings") {
      e <- suppressWarnings(my_divide(d, a))
    } else {
      e <- my_divide(d, a)
    }

    f <- e + 100
    f

  }, warning = function(w) {
    message("Caught warning: ", conditionMessage(w))
    e <- my_divide(d, 0.1)
    f <- e + 100
    return(f)

  }, error = function(e) {
    message("Caught error: ", conditionMessage(e))
    e <- my_divide(d, 0.01)
    f <- e + 100
    return(f)

  }, finally = {
    message("a = ", a, "| b = ", b, "| c = ", c, "| d = ", d)
  })

  message("Final result: ", result)
  return(result)
}

Try it with different arguments:

run_example("warning")

Caught warning: my_divide warning message

a = warning| b = 2| c = 4| d = 6

Final result: 160

[1] 160

run_example("error")

Caught error: my_divide error message

a = error| b = 2| c = 4| d = 6

Final result: 700

[1] 700

run_example("2")

a = 2| b = 2| c = 4| d = 6

Final result: 103

[1] 103

run_example("suppress-warnings")

a = suppress-warnings| b = 2| c = 4| d = 6

Final result: NA

[1] NA

This function shows several important ideas:

You can distinguish between warnings and errors.
You can recover differently depending on what went wrong.
The finally block executes even if an error occurred.
Values returned from the handler become the return value of the entire tryCatch() block.

3. Custom Error Classes and Structured Conditions

Catching errors like "something went wrong" is helpful, but not very specific. In a large pipeline, we often want to know:

What type of error occurred
Which function or step caused it
What data triggered the failure

This is where custom error classes and structured metadata become useful.

Let’s say we want to define a special kind of error for when we detect duplicated rows in a dataset:

dt <- data.table(
  country = c("A", "A", "B", "C", "C"),
  year = c(2020, 2020, 2021, 2022, 2022),
  value = c(10, 10, 20, 30, 30),
  id = c("A2020", "A2020", "B2021", "C2022", "C2022")
)

keyVar <- c("country", "year")

We can use cli::cli_abort() or rlang::abort() to raise an error with a custom class (using the class argument) and additional metadata with customed names arguments (i.e., arguments with names that you choose):

check_for_duplicates <- function(dt, keyVar) {
  # Protect against unexpected base errors like missing columns
  tryCatch({
    if (uniqueN(dt, by = keyVar) != nrow(dt)) {
      dup_rows <- dt[duplicated(dt, by = keyVar)]
      n_rep <- nrow(dup_rows)

      cli::cli_abort(
        message = "Found {n_rep} duplicated row(s) in the dataset.",
        class = c("dup_pfw", "validation_error"),
        key = keyVar,
        ids = unique(dup_rows$id)
      )
    }

    return(dt)
  }, error = function(e) {
    # Bubble up unknown errors as-is to be caught by outer tryCatch
    stop(e)
  })
}

This raises a custom error of class "dup_pfw", which we can later catch specifically inside tryCatch():

result <- tryCatch(
  expr = check_for_duplicates(dt, keyVar),
  
  dup_pfw = function(e) {
    message("Caught a known duplication issue.")
    message("Key used: ", paste(e$key, collapse = ", "))
    message("Affected IDs: ", paste(unique(e$ids), collapse = ", "))
    return(NULL)  # or apply a fix and return cleaned data
  },

  error = function(e) {
    message("Caught an unknown error: ", conditionMessage(e))
    return(NULL)
  }
)

Caught a known duplication issue.

Key used: country, year

Affected IDs: A2020, C2022

Why is this better?

We’re not just catching any error — we’re catching expected errors by class.
We can extract metadata (e$key, e$ids) to include in a log or a diagnostic message.
If the error is not of class "dup_pfw", it’s passed on to the more general error handler.

This sets us up for a robust logging system: when a known issue occurs, we log it with context; if it’s unknown, we escalate it or halt.

4. Building a Lightweight Logging System

When your code is running on hundreds of datasets or in a production pipeline, you don’t just want to know that something failed—you want a record of what failed, where, and why.

Let’s create a simple logging function that writes errors to a file:

add_log <- function(cnd, logfile) {
  timestamp <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")

  cat(
    "[", timestamp, "] ",
    "[", class(cnd)[1], "] ",
    cnd$message,
    if (!is.null(cnd$ids)) {
      paste(" | IDs:", paste(cnd$ids, collapse = ", "))
    },
    "\n",
    file = logfile,
    append = TRUE, 
    sep = ""
  )
}

This function takes any condition object and appends a formatted line to a plain-text file. It includes:

The time of the error
The class of the error
The message
Optionally, any custom metadata (like IDs or keys)

Now let’s use it inside a `tryCatch()` block:

logfile <- tempfile(fileext = ".txt")

handle_duplicates <- function(dt, keyVar, logfile) {
  tryCatch(
    expr = check_for_duplicates(dt, keyVar),

    dup_pfw = function(e) {
      add_log(e, logfile = logfile)  # log the known duplication error
      dt_clean <- unique(dt, by = keyVar)  # remove duplicates
      return(dt_clean)
    },

    error = function(e) {
      add_log(e, logfile = logfile)  # log unknown errors too
      message("Unexpected error occurred. See log for details.")
      return(NULL)  # <-- graceful exit without re-raising the error
      # Or, if you need the function to fail, you could do this:
      # stop("Unexpected error occurred. See log for details.")
    }
  )
}


handle_duplicates(dt, keyVar, logfile)

   country  year value     id
    <char> <num> <num> <char>
1:       A  2020    10  A2020
2:       B  2021    20  B2021
3:       C  2022    30  C2022

handle_duplicates(dt, "wrongVar", logfile)

Unexpected error occurred. See log for details.

NULL

This version of handle_duplicates() does three things:

Tries to detect duplicate rows.
If duplicates are found, it logs the error and removes them.
If an unknown error occurs, it logs that too—and stops execution.

This pattern is clean, recoverable, and extensible. You can add new condition classes for things like missing values, invalid columns, or range checks, and respond to each one differently.

Reading and Summarizing the Log File

After logging multiple errors during a data-cleaning process, you may want to quickly inspect what went wrong. Here’s how to read the log and get a summary of known vs. unknown errors.

Let’s assume we’ve used add_log() (from the previous section) to write to a file called log.txt.

# Preview the raw log
readLines(logfile)

[1] "[2025-05-14 19:32:41] [dup_pfw] Found 2 duplicated row(s) in the dataset. | IDs: A2020, C2022"                      
[2] "[2025-05-14 19:32:41] [simpleError] argument specifying columns received non-existing column(s): cols[1]='wrongVar'"

Create a helper to summarize logs by type

We can write a small function (summarize_log()) to parse the log and report how many known and unknown errors occurred:

Code

summarize_log <- function(logfile) {
  if (!file.exists(logfile)) {
    message("No log file found.")
    return(invisible(NULL))
  }

  lines <- readLines(logfile)

  known <- grep("\\[dup_pfw\\]", lines, value = TRUE)
  unknown <- grep("\\[(simpleError|error)\\]", lines, value = TRUE)

  cat("== Log Summary ==\n")
  cat("Total entries:", length(lines), "\n")
  cat("Known 'dup_pfw' errors:", length(known), "\n")
  cat("Other (unknown) errors:", length(unknown), "\n")

  if (length(known)) {
    cat("\nKnown issues:\n")
    cat(paste("-", known), sep = "\n")
  }

  if (length(unknown)) {
    cat("\nUnknown issues:\n")
    cat(paste("-", unknown), sep = "\n")
  }
}

#Now call it like this:
summarize_log(logfile)

== Log Summary ==
Total entries: 2 
Known 'dup_pfw' errors: 1 
Other (unknown) errors: 1 

Known issues:
- [2025-05-14 19:32:41] [dup_pfw] Found 2 duplicated row(s) in the dataset. | IDs: A2020, C2022

Unknown issues:
- [2025-05-14 19:32:41] [simpleError] argument specifying columns received non-existing column(s): cols[1]='wrongVar'

This small addition turns your log from a flat text dump into a searchable, inspectable tool — one that could be expanded later to generate HTML reports, markdown diagnostics, or summaries across datasets.

5. Robust example with cleaning , logging, and `skip_err` Logic

Now that we know how to catch and log structured and unexpected errors, let’s implement a more flexible function with real-world behavior:

It logs known issues (dup_pfw class).
It can skip or stop depending on a skip_err flag.
It logs any unexpected error.

Function: `clean_duplicates()`

Pay careful attention to this function because it has many important details and applied concepts.

clean_duplicates <- function(dt, keyVar, logfile, skip_err = TRUE) {
  tryCatch(
    expr = {
      if (uniqueN(dt, by = keyVar) != nrow(dt)) {
        dup_rows <- dt[duplicated(dt, by = keyVar)]
        n_rep <- nrow(dup_rows)

        cli::cli_abort(
          message = "Found {n_rep} duplicated row{?s}.",
          class = c("dup_pfw", "validation_error"),
          key = keyVar,
          ids = unique(dup_rows$id),
          skip = skip_err
        )
      }

      return(dt)
    },

    dup_pfw = function(e) {
      add_log(e, logfile)

      if (!isTRUE(e$skip)) {
        stop("Duplicate data found and skip_err = FALSE. See log.")
      }

      # else skip and return cleaned data
      dt_clean <- unique(dt, by = keyVar)
      return(dt_clean)
    },

    error = function(e) {
      add_log(e, logfile)
      if (!skip_err) {
        stop("Unknown error. Halting execution. See log.")
      } else {
        message("Unknown error. Skipping due to skip_err = TRUE.")
        return(NULL)
      }
    }
  )
}

Example Usage

Create a dataset with duplicates:

dt <- data.table::data.table(
  country = c("A", "A", "B"),
  year = c(2020, 2020, 2021),
  value = c(10, 10, 20),
  id = c("A2020", "A2020", "B2021")
)

keyVar <- c("country", "year")
logfile <- tempfile(fileext = ".txt")

Run it with the default skip_err = TRUE:

result <- clean_duplicates(dt, keyVar, logfile)
# THis is the data cleaned, but the dulplicates
# should be reported in logfile
result

   country  year value     id
    <char> <num> <num> <char>
1:       A  2020    10  A2020
2:       B  2021    20  B2021

Then test the behavior when skip_err = FALSE:

# this will create a similar entry in the log as the one above
# since it is executed in the same second, I am adding a second to show that they 
# are different
Sys.sleep(1)
clean_duplicates(dt, keyVar, logfile, skip_err = FALSE)

Error in value[[3L]](cond): Unknown error. Halting execution. See log.

Try again with a bad key:

clean_duplicates(dt, "bad_column", logfile)

Unknown error. Skipping due to skip_err = TRUE.

NULL

Read the log:

# using helper from earlier to get a quick summary.
summarize_log(logfile)

== Log Summary ==
Total entries: 4 
Known 'dup_pfw' errors: 2 
Other (unknown) errors: 2 

Known issues:
- [2025-05-14 19:32:41] [dup_pfw] Found 1 duplicated row. | IDs: A2020
- [2025-05-14 19:32:42] [dup_pfw] Found 1 duplicated row. | IDs: A2020

Unknown issues:
- [2025-05-14 19:32:42] [simpleError] Duplicate data found and skip_err = FALSE. See log.
- [2025-05-14 19:32:42] [simpleError] argument specifying columns received non-existing column(s): cols[1]='bad_column'

# Or use  `readLines(logfile)`

Summary

In this post, we learned how to use tryCatch() in R for robust error handling and logging in data workflows. The post began by introducing the basic structure and mechanics of tryCatch(), including how to catch warnings, errors, and use a finally block for cleanup.

We then explored a practical example of a function (my_divide) that behaves differently based on input, showing how tryCatch() can handle known problems and apply recovery logic.

Next, we introduced the concept of custom error classes using cli::cli_abort(), allowing for structured, class-based error signaling. By assigning custom classes (e.g., "dup_pfw") and attaching metadata (e.g., keys and identifiers), we created a mechanism for precise error identification and downstream handling.

Building on this, we developed a lightweight logging function (add_log()) that writes structured error information to a file. This allowed for the creation of reproducible logs that can be summarized and inspected, enabling transparency and traceability in pipeline failures.

To inspect the logs, we created a summarize_log() function that parses the log file and distinguishes between known errors (like duplicate rows) and unknown errors (e.g., column not found), providing an accessible summary of all logged issues.

Finally, we consolidated these techniques into a general-purpose function (clean_duplicates) that checks for duplicates, logs failures, and decides whether to continue or stop based on a skip_err flag. This function demonstrated how to handle both expected and unexpected errors, clean the data if possible, and capture complete information in the log file.

Together, these components form a modular and extensible framework for structured error handling, recovery, and logging in R, especially suitable for automated and large-scale data processing tasks.

Introduction

1. A First Look at tryCatch()

2. A Practical Example with Recovery Logic

3. Custom Error Classes and Structured Conditions

Why is this better?

4. Building a Lightweight Logging System

Now let’s use it inside a tryCatch() block:

Reading and Summarizing the Log File

Create a helper to summarize logs by type

5. Robust example with cleaning , logging, and skip_err Logic

Function: clean_duplicates()

Example Usage

Read the log:

Summary

1. A First Look at `tryCatch()`

Now let’s use it inside a `tryCatch()` block:

5. Robust example with cleaning , logging, and `skip_err` Logic

Function: `clean_duplicates()`