Verifying downloads in automation testing is often overlooked, yet it's crucial to ensure our files are delivered correctly every time. In this post, we'll dive into practical strategies and code examples to automate file verification, helping us catch download failures before they impact our users. Say goodbye to unreliable downloads and hello to robust, automated file validation!
Download verification is a critical part of automation testing, yet teams often skip it because it seems too hard to implement well. Unlike testing UI interactions or form submissions, verifying downloads means working outside the browser's scope, dealing with the file system, and handling asynchronous processes, all of which complicate test automation.
These challenges highlight why download verification is often sidelined during test automation. However, overcoming these obstacles with the right strategies, tools, and best practices ensures that our downloads are tested thoroughly and reliably, making our automation suite more robust and dependable.
Setting up our testing environment properly is essential for effective download verification. Let's look at a comparison of download directory configuration in Selenium and Playwright, including the advantages of using a custom download path.
1. Selenium (Python):
In Selenium, we can customize the download directory using browser options. This setup directs the browser to save downloaded files to a specific location on our machine.
from selenium import webdriver
from selenium.webdriver.common.by import By
options = webdriver.ChromeOptions()
prefs = {"download.default_directory": "/path/to/downloads"}
options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(options=options)
driver.get("https://www.test.com")
download_button = driver.find_element(By.ID, "download-button")
download_button.click()
Benefits:
2. Playwright (JavaScript):
Playwright allows us to handle downloads seamlessly by using its built-in event handling for downloads and specifying a custom directory for saving files. The following example demonstrates how to save a file to a specific path when a download is triggered.
const { test } = require("@playwright/test");
test("Download file to specified path", async ({ page }) => {
  await page.goto("https://www.test.com");
  const downloadPromise = page.waitForEvent("download");
  await page.getByText("Download Sample File").click();
  const download = await downloadPromise;
  await download.saveAs("/path/to/downloads/" + download.suggestedFilename());
});
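We can also control where Playwright stores downloaded files with the downloadsPath option in browserType.launch(). Keep in mind that Playwright deletes downloaded files when the browser context that produced them is closed, so download.saveAs() is still the way to keep a copy.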
Benefits:
Setting a custom download directory in our testing environment offers several key advantages:
Configuring our download directory is a foundational step that simplifies the entire process of download verification, allowing us to build more effective and dependable automation tests.
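As a rough sketch of the Selenium setup above, the preferences can be built by a small helper so that each test run gets its own empty directory. The helper name `build_download_prefs` and the use of `tempfile.mkdtemp` are assumptions made for illustration; `download.prompt_for_download` is a Chrome preference that suppresses the save dialog.

```python
import os
import tempfile


def build_download_prefs(download_dir=None):
    """Return (directory, prefs) for a Chrome session with a custom download path.

    Hypothetical helper: if no directory is given, a fresh temporary one is created.
    """
    if download_dir is None:
        download_dir = tempfile.mkdtemp(prefix="downloads-")
    # Use an absolute path so the browser and the test agree on the location.
    download_dir = os.path.abspath(download_dir)
    os.makedirs(download_dir, exist_ok=True)
    prefs = {
        "download.default_directory": download_dir,
        # Skip the "Save as" dialog so the download starts without user input.
        "download.prompt_for_download": False,
    }
    return download_dir, prefs


# Usage with Selenium (not executed here):
#   options = webdriver.ChromeOptions()
#   download_dir, prefs = build_download_prefs()
#   options.add_experimental_option("prefs", prefs)
```

Returning the directory alongside the preferences lets later verification steps look for files in the same place the browser was told to write them.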
Simulating a user-triggered download is an essential part of verifying download functionality in automated tests. Downloads can be initiated in various ways, such as clicking a link, submitting a form, or interacting with a button. Ensuring the right approach to triggering downloads in automation tests is crucial to accurately simulate real-world user behavior.
Various Download Initiation Methods
One of the simplest ways to trigger a download is by clicking a link. This is commonly used when users download files like PDFs or images from a webpage. The automation framework clicks the link, and the download is triggered in the background.
Example (Selenium - Python):
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://www.test.com")
download_link = driver.find_element(By.LINK_TEXT, "Click to Download")
download_link.click()
Example (Playwright - JavaScript):
await page.goto("https://www.test.com");
const downloadPromise = page.waitForEvent("download");
await page.click(`text="Click to Download"`);
const download = await downloadPromise;
await download.saveAs("/path/to/downloads/" + download.suggestedFilename());
Best Practices:
In some cases, downloads are triggered by submitting a form, such as when a user fills out information and downloads a receipt or invoice. In this scenario, the form submission is automated, and the download is triggered once the form is processed.
Example (Selenium - Python):
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://www.test.com")
first_name = driver.find_element(By.NAME, "first-name")
first_name.send_keys("John")
last_name = driver.find_element(By.NAME, "last-name")
last_name.send_keys("Doe")
submit_button = driver.find_element(By.ID, "submit-button")
submit_button.click()
Example (Playwright - JavaScript):
await page.goto("https://www.test.com");
await page.fill('input[name="first-name"]', "John");
await page.fill('input[name="last-name"]', "Doe");
const downloadPromise = page.waitForEvent("download");
await page.click('button[type="submit"]');
const download = await downloadPromise;
await download.saveAs("/path/to/downloads/" + download.suggestedFilename());
Best Practices:
Many web applications have buttons that directly trigger a download (e.g., "Download Report" or "Export CSV"). Simulating this action is straightforward with automation tools by clicking the button and handling the download event, as we have already seen in the setup section.
Example (Selenium - Python):
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://www.test.com")
download_button = driver.find_element(By.ID, "download-button")
download_button.click()
Example (Playwright - JavaScript):
await page.goto("https://www.test.com");
const downloadPromise = page.waitForEvent("download");
await page.click("#download-button");
const download = await downloadPromise;
await download.saveAs("/path/to/downloads/" + download.suggestedFilename());
Best Practices:
Once a download is triggered, the next critical step is to handle the download properly and ensure that the file is fully downloaded before proceeding with further test steps. Different automation tools offer various methods to manage and wait for downloads to complete. This section covers how to handle downloads in Selenium by polling the directory and in Playwright using its built-in download handling methods. We'll also discuss managing timing issues and ensuring the file is downloaded successfully.
Selenium doesn't have built-in support for download handling, so a common approach is to poll the download directory and wait until the file appears. We need to monitor the directory for the expected file, then proceed once the file is detected.
import os
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
options = webdriver.ChromeOptions()
download_directory = "/path/to/downloads/"
prefs = {"download.default_directory": download_directory}
options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(options=options)
driver.get("https://www.test.com")
download_button = driver.find_element(By.ID, "download-button")
download_button.click()
file_name = "sample-file.txt"
file_path = os.path.join(download_directory, file_name)
timeout = 30 # Maximum wait time in seconds
poll_interval = 1 # Time between checks in seconds
elapsed_time = 0
while not os.path.exists(file_path):
    time.sleep(poll_interval)
    elapsed_time += poll_interval
    if elapsed_time > timeout:
        raise Exception(f"Download timed out. File not found: {file_name}")
print(f"File downloaded successfully: {file_name}")
driver.quit()
Tips for Managing Timing Issues in Selenium:
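One way to tighten the polling loop above is to also wait until the browser has finished writing the file. The sketch below assumes Chrome, which keeps in-progress downloads under a `.crdownload` suffix, and treats the download as complete once the target file exists, no partial file remains, and the size is unchanged across two polls. The function name and default values are illustrative.

```python
import os
import time


def wait_for_complete_download(directory, file_name, timeout=30.0, poll_interval=0.5):
    """Wait until file_name exists in directory and no partial download remains.

    Returns the full path on success and raises TimeoutError otherwise.
    """
    file_path = os.path.join(directory, file_name)
    deadline = time.monotonic() + timeout
    last_size = -1
    while time.monotonic() < deadline:
        # Chrome keeps in-progress downloads under a ".crdownload" suffix.
        partial = [n for n in os.listdir(directory) if n.endswith(".crdownload")]
        if os.path.exists(file_path) and not partial:
            size = os.path.getsize(file_path)
            # Require two consecutive polls with the same non-zero size.
            if size > 0 and size == last_size:
                return file_path
            last_size = size
        time.sleep(poll_interval)
    raise TimeoutError(f"Download did not complete within {timeout}s: {file_name}")
```

Using `time.monotonic()` for the deadline keeps the timeout accurate even when each poll takes longer than the sleep interval.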
Unlike Selenium, Playwright has built-in support for handling downloads, making it much easier to manage the download process. As we have already seen, Playwright can capture download events and wait for the file to be completely downloaded before saving it to a specific path.
const downloadPromise = page.waitForEvent("download");
await page.click("#download-button");
const download = await downloadPromise;
const filePath = "/path/to/downloads/" + download.suggestedFilename();
await download.saveAs(filePath);
How Playwright Manages Downloads:
Once the download process is complete, the next crucial step in download testing is to verify the integrity of the downloaded files. This involves ensuring that the file has the correct name and type, verifying its content, and performing checks to ensure the file isn't empty or corrupted.
Ensuring the downloaded file has the correct name and extension is the first step in verifying that the download worked as expected. Automation scripts should compare the actual file name and extension against what is expected.
Example (Selenium - Python):
import os
expected_file_name = "sample-file.txt"
download_directory = "/path/to/downloads"
file_path = os.path.join(download_directory, expected_file_name)
if os.path.exists(file_path) and file_path.endswith('.txt'):
    print(f"File name and type verification passed: {expected_file_name}")
else:
    raise Exception(f"File name or type verification failed: {file_path}")
Example (Playwright - JavaScript):
const fs = require("fs");
const path = require("path");
const expectedFileName = "sample-file.txt";
const downloadDirectory = "/path/to/downloads";
const filePath = path.join(downloadDirectory, expectedFileName);
if (fs.existsSync(filePath) && filePath.endsWith(".txt")) {
  console.log(`File name and type verification passed: ${expectedFileName}`);
} else {
  throw new Error(`File name or type verification failed: ${filePath}`);
}
Best Practices:
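An extension only tells us what the file claims to be. As a hedged sketch, the type check can go one step further by comparing the file's leading bytes against the well-known signature for its extension. The `SIGNATURES` table and the function name are assumptions for illustration and cover only a few formats.

```python
import os

# Leading bytes ("magic numbers") for a few common formats.
SIGNATURES = {
    ".pdf": b"%PDF-",
    ".png": b"\x89PNG\r\n\x1a\n",
    ".jpg": b"\xff\xd8\xff",
}


def matches_expected_type(file_path):
    """Return True if the file's leading bytes match the signature for its extension.

    Extensions without a known signature (for example .txt) are accepted
    without a content check, since plain text has no fixed header.
    """
    extension = os.path.splitext(file_path)[1].lower()
    signature = SIGNATURES.get(extension)
    if signature is None:
        return True
    with open(file_path, "rb") as handle:
        return handle.read(len(signature)) == signature
```

This catches a common failure mode where the server returns an HTML error page but the browser still saves it under the expected file name.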
Content verification is crucial when downloading files that need to contain specific data, such as text files, PDFs, or images. We need to ensure that the downloaded file not only has the correct name but also contains the expected content.
For text files, verifying the content is straightforward. We can read the file and check if it contains specific keywords or phrases.
Example (Selenium - Python):
with open(file_path, 'r') as file:
    content = file.read()
if "expected text" in content:
    print("Text content verification passed")
else:
    raise Exception("Text content verification failed")
Example (Playwright - JavaScript):
const fileContent = fs.readFileSync(filePath, "utf-8");
if (fileContent.includes("testing download")) {
  console.log("Text content verification passed");
} else {
  throw new Error("Text content verification failed");
}
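For structured text such as a CSV export, a substring match can be too loose. The following sketch, which assumes a UTF-8 file with a header row, checks the column names and that at least one data row is present. The function name and parameters are hypothetical.

```python
import csv


def csv_has_expected_header(file_path, expected_header, min_rows=1):
    """Check that a CSV file starts with expected_header and has at least min_rows data rows."""
    with open(file_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)  # None when the file is empty
        if header != list(expected_header):
            return False
        # Count non-empty rows after the header.
        data_rows = sum(1 for row in reader if row)
    return data_rows >= min_rows
```

A call such as `csv_has_expected_header(file_path, ["first-name", "last-name"])` would then replace the keyword check for tabular downloads; the column names here are placeholders.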
To verify the content of PDFs, we can use libraries like PyPDF2 in Python to extract and check the text inside the PDF.
import PyPDF2
with open(file_path, 'rb') as file:
    pdf_reader = PyPDF2.PdfReader(file)
    first_page = pdf_reader.pages[0]
    text = first_page.extract_text()
if "expected content" in text:
    print("PDF content verification passed")
else:
    raise Exception("PDF content verification failed")
Similarly, we can use the pdf-parse library for JavaScript.
const fs = require("fs");
const pdf = require("pdf-parse");
const dataBuffer = fs.readFileSync(filePath);
const data = await pdf(dataBuffer);
const text = data.text;
if (text.includes("expected content")) {
  console.log("PDF content verification passed");
} else {
  throw new Error("PDF content verification failed");
}
For images, content verification may involve comparing the downloaded image against an expected reference image. Tools like Pillow in Python or img-diff-js in JavaScript can be used for pixel-based comparisons.
Example (Python using Pillow):
from PIL import Image, ImageChops
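# Convert both images to RGB so their modes match before comparing
downloaded_image = Image.open(file_path).convert("RGB")
reference_image = Image.open("/path/to/reference-image.png").convert("RGB")
diff = ImageChops.difference(downloaded_image, reference_image)
# A size mismatch means the images differ; getbbox() returns None when no pixel differs
if downloaded_image.size == reference_image.size and not diff.getbbox():
    print("Image verification passed")
else:
    raise Exception("Image verification failed")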
Example (JavaScript using img-diff-js):
const { imgDiff } = require("img-diff-js");
imgDiff({
  actualFilename: filePath,
  expectedFilename: "/path/to/reference-image.jpg",
  diffFilename: "/path/to/diff.png",
}).then((result) => {
  if (result.imagesAreSame) {
    console.log("Image verification passed");
  } else {
    throw new Error("Image verification failed");
  }
});
To ensure that the downloaded file isn't empty or corrupted, checking the file size is a quick and effective method. A file that is too small may indicate that the download failed or the file is incomplete.
Example (Selenium - Python):
file_size = os.path.getsize(file_path)
if file_size > 0:
    print(f"File size verification passed: {file_size} bytes")
else:
    raise Exception("File size verification failed: File is empty or corrupted")
Example (Playwright - JavaScript):
const stats = fs.statSync(filePath);
const fileSizeInBytes = stats.size;
if (fileSizeInBytes > 0) {
  console.log(`File size verification passed: ${fileSizeInBytes} bytes`);
} else {
  throw new Error("File size verification failed: File is empty or corrupted");
}
Best Practices:
In real-world automation testing, file downloads don't always go smoothly. Various factors, such as network issues, incorrect file paths, or even server-side errors, can lead to failed or incomplete downloads. To ensure the robustness of our test scripts, it's important to handle these edge cases effectively.
Download failures can occur due to reasons such as:
To account for these potential issues, our tests should have mechanisms to detect and handle download failures gracefully.
Example (Selenium - Python):
import os
import time
def wait_for_download(file_path, timeout=30):
    for _ in range(timeout):
        if os.path.exists(file_path):
            return True
        time.sleep(1)
    return False
file_path = "/path/to/downloads/sample-file.txt"
# Attempt to download the file
# (Add your Selenium download code here)
if wait_for_download(file_path):
    print("File downloaded successfully")
else:
    raise Exception("File download failed after timeout")
Example (Playwright - JavaScript):
const fs = require('fs');
async function waitForDownload(downloadPath, timeout = 30000) {
  const startTime = Date.now();
  while (Date.now() - startTime < timeout) {
    if (fs.existsSync(downloadPath)) {
      return true;
    }
    await new Promise(resolve => setTimeout(resolve, 1000));
  }
  return false;
}
const downloadPath = '/path/to/downloads/sample-file.txt';
// Trigger the download action
const downloadCompleted = await waitForDownload(downloadPath);
if (downloadCompleted) {
  console.log('Download succeeded');
} else {
  throw new Error('Download failed or timed out');
}
Strategies for Handling Failures:
In some cases, the download may be complete, but the file is incorrect or corrupted. This could happen due to server-side errors or incomplete downloads. One approach to address this is to verify the file's integrity after download, such as by checking its size or content.
Example (File Integrity Check - Python):
import os
# Set a minimum file size in bytes
min_file_size = 1024
if os.path.getsize(file_path) > min_file_size:
    print("File size is valid")
else:
    raise Exception("Downloaded file is too small, might be corrupted")
Example (File Size Check in Playwright - JavaScript):
const stats = fs.statSync(downloadPath);
const fileSizeInBytes = stats.size;
// Set a minimum file size to ensure the file isn't corrupted
if (fileSizeInBytes > 1024) {
  console.log('File size is valid');
} else {
  throw new Error('Downloaded file is too small, might be corrupted');
}
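A size threshold cannot tell a truncated file from a complete one of similar length. Where the expected file is known in advance, a stricter integrity check is to compare a cryptographic digest. The sketch below assumes the test has access to a known-good SHA-256 value for the file; the function names are illustrative.

```python
import hashlib


def sha256_of_file(file_path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as handle:
        # Read fixed-size chunks so large downloads are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(file_path, expected_sha256):
    """Raise ValueError if the file's digest differs from the expected one."""
    actual = sha256_of_file(file_path)
    if actual != expected_sha256.lower():
        raise ValueError(f"Checksum mismatch for {file_path}: got {actual}")
```

This only suits downloads whose bytes are stable between runs; files that embed timestamps or generated IDs need content-level checks instead.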
A common strategy to handle intermittent issues, such as network blips or server timeouts, is to implement a retry mechanism. This allows the test to retry the download action a certain number of times before failing.
I've covered retry patterns in more detail in a separate blog post, so I encourage you to experiment with different approaches and find the best fit for your scripts.
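A minimal retry wrapper might look like the following sketch. It is one possible shape, not the approach from the post mentioned above: the function name, the fixed delay, and the default of three attempts are all assumptions.

```python
import time


def retry(action, attempts=3, delay=1.0, exceptions=(Exception,)):
    """Call action() up to `attempts` times, sleeping `delay` seconds between failures.

    Returns the first successful result; re-raises the last exception if all attempts fail.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions as error:
            if attempt == attempts:
                raise  # Out of attempts: let the test fail with the real error
            print(f"Attempt {attempt} failed ({error}); retrying in {delay}s")
            time.sleep(delay)
```

In a download test, `action` would be a function that clicks the download control and waits for the file, so each retry repeats the whole trigger-and-wait step rather than only the wait.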
Beyond retrying, we can handle download errors gracefully by adding error-handling mechanisms within our test scripts. This ensures the tests fail with meaningful error messages, making it easier to debug issues.
Example (Selenium - Python with Try-Except):
try:
    # Trigger download and wait for file
    if wait_for_download(file_path):
        print("File downloaded successfully")
    else:
        raise Exception("File download failed after timeout")
except Exception as e:
    print(f"Error occurred during download: {e}")
    raise  # Re-raise so the test still fails after logging
Example (Playwright - JavaScript with Try-Catch):
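try {
  // Trigger download (downloadWithRetry is assumed to be a retry helper defined elsewhere in the suite)
  await downloadWithRetry(page);
} catch (error) {
  console.error(`Download error: ${error.message}`);
  throw error; // Re-throw so the test still fails after logging
}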
1. Environment Cleanup
After tests are complete, it's essential to delete downloaded files to prevent clutter and potential interference in future tests. Regular cleanup also helps maintain available disk space, especially when dealing with large files.
2. Stable Paths and Reliable File Names
Using consistent download paths and predictable file names minimizes the chances of false negatives. A custom download directory ensures files are always saved in a known location, and validating filenames helps confirm the correct file was downloaded.
3. Performance Considerations
Verifying large or multiple downloads can slow down tests. To improve performance, limit the scope of validation (e.g., check file sizes before deeper inspection) and run parallel validations for multiple downloads. Using minimum file size thresholds can quickly rule out incomplete or corrupted files.
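For the environment cleanup practice above, one option in Python is to wrap the download directory in a context manager so that it is removed even when a test fails. This is a sketch under the assumption that each test can use its own throwaway directory; the name `temporary_download_dir` is hypothetical.

```python
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_download_dir(prefix="downloads-"):
    """Yield a fresh download directory and delete it (with its files) afterwards."""
    directory = tempfile.mkdtemp(prefix=prefix)
    try:
        yield directory
    finally:
        # Remove the directory even if the test body raised an exception.
        shutil.rmtree(directory, ignore_errors=True)
```

Inside a `with temporary_download_dir() as download_dir:` block, the path can be passed to the browser's download preferences, and nothing is left behind for the next test to trip over.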
Automating file download verification is a critical step in ensuring that files are correctly delivered and functional in our applications. By addressing common challenges like handling downloads, verifying file content, and managing edge cases, we can build more reliable and efficient tests. Following best practices such as environment cleanup, using stable paths, and optimizing performance will enhance the stability of our automation framework. Ultimately, with the right strategies in place, we can be confident that downloads succeed and meet expectations every time.
As always, you can find code examples and a demo website on our GitHub repository for hands-on experimentation.