PyAutoGUI: How To Automate Work With Python | Full Course With Projects!

Photo by Jason Leung on Unsplash

# Introduction

Are you interested in learning how to automate tasks using PyAutoGUI, even with minimal programming knowledge? Have you ever wondered if you could take automation to the next level by automating a widely popular game enjoyed by millions worldwide?

You've come to the right place! Get ready to explore the world of automation with Python. Grab your code editor, roll up your sleeves, and let's embark on this automation journey together.

And to pique your interest, why not take a sneak peek at the game we'll be automating? Here's a link to the game to motivate you for the course!

# What Will You Do?

You’ll write a program to automate the process of creating files on your PC, and even go above and beyond to automate the T-rex runner (dinosaur) game.

# What Will You Learn?

This course is designed to give you a solid foundation in PyAutoGUI that you can build on and apply to any repetitive real-world task you encounter. By the end of this course, you’ll be able to:

  • Automate computer tasks with basic Python scripts.

  • Automate mouse and keyboard operations to click and type for you.

  • Automate tasks in word processing apps (e.g., MS Word, Notepad, etc.), such as text editing, formatting, and more.

  • Dive into the world of game automation and automate simple games.

# Pre-requisites

  • No prior Python knowledge is necessary

  • No programming experience is required

  • Strong passion and willingness to code.

  • Any computer and OS: Windows, macOS, or Linux.

# Who is this Course for?

- Individuals with no prior programming knowledge

- Programmers

- Anyone interested in learning how to automate computer work.

- Gamers

- Office Workers

- Students

- Small business owners

# Getting Started with PyAutoGUI

Before we get to the substantive part of this course, we need to set up our development environment, just as you need a kitchen before you can prepare a meal. A code editor is that kitchen: the place where raw ingredients are gathered and turned into mouth-watering dishes.

# Setting up a PyAutoGUI environment

For this project, we are using Visual Studio Code, commonly known as VS Code, as our code editor. VS Code is a free, open source, lightweight code editor that supports almost every programming language. At the time of writing, VS Code is the most widely used code editor among programmers worldwide. So let’s install VS Code on our PC.

# VS Code Installation

To get started with VS Code, follow the installation guide for your operating system:

| Operating System | Installation Link |
| --- | --- |
| Windows | [Download VS Code for Windows] |
| macOS | [Download VS Code for macOS] |
| Linux | [Download VS Code for Linux] |

By now, you should have VS Code installed on your computer. Congratulations! Next, we install Python, the main ingredient we will cook with in our VS Code kitchen.

# Python Installation

Now it’s time to install Python, the programming language we will use to tell our computer to carry out boring, repetitive tasks for us. Before hiring our digital “nanny”, we need to speak the nanny’s language, and Python is the language we use to talk to PyAutoGUI. So let’s set up this language on our PC.

To install Python, follow the link for your operating system in the table below:

| Operating System | Installation Link |
| --- | --- |
| Windows | [Download Python for Windows] |
| macOS | [Download Python for macOS] |
| Linux | [Download Python for Linux] |

Now that you've installed VS Code and Python, let's get PyAutoGUI, our digital “nanny”, working in our kitchen so it can help us automate boring daily tasks on our computer.

# About the PyAutoGUI Library

PyAutoGUI is a Python library for graphical user interface (GUI) automation. It lets a script take control of your mouse and keyboard, so it can automate most tasks you would otherwise perform by hand with them.

As mentioned earlier, you can think of PyAutoGUI as a digital “nanny” that assists you as you work on your computer, just as you might employ household help to manage chores at home. The fun part is that, unlike an actual nanny, PyAutoGUI does not cost a dime. All you need is to speak its language correctly, and it will do whatever you instruct it to.

# Installing PyAutoGUI

Before we can use the Python package PyAutoGUI, we need to install it on your PC.

To install PyAutoGUI, open the terminal (Command Prompt on Windows) in VS Code and run the command for your operating system from the table below.

| Windows | macOS | Linux |
| --- | --- | --- |
| pip install pyautogui | pip3 install pyobjc-core | pip3 install python3-xlib |

On macOS and Linux, the command shown installs a supporting package. Run pip3 install pyautogui afterwards to install PyAutoGUI itself.

Awesome job! You’re all set with PyAutoGUI. Pat yourself on the back! Now let’s explore pyautogui and start automating tasks.

# Basics of PyAutoGUI

To get started with PyAutoGUI, we will explore some of its fundamental functions. Initially, we'll write code to retrieve your screen's resolution. Subsequently, we'll proceed to determine the cursor's position wherever you place it. Let's begin.

Getting your Screen Resolution

The size() function returns the screen’s resolution as a width and a height in pixels.

import pyautogui

screenWidth, screenHeight = pyautogui.size()

print("The Width of the screen is:", screenWidth)

print("The Height of the screen is:", screenHeight)

In this code snippet, pyautogui.size() returns the screen's dimensions, which we unpack into screenWidth and screenHeight and then print. The output depends on your own screen's resolution.
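Because the resolution differs from one machine to the next, hard-coded coordinates rarely carry over between computers. As a minimal sketch, the helper to_pixels below (a hypothetical name, not part of PyAutoGUI) converts a position expressed as fractions of the screen into pixel coordinates, and show_center feeds it the real values from pyautogui.size().

```python
def to_pixels(x_fraction, y_fraction, screen_width, screen_height):
    """Convert fractions of the screen (0.0 to 1.0) into pixel coordinates.

    Hypothetical helper for illustration; it is not a PyAutoGUI function.
    """
    # Valid pixel indexes run from 0 to width - 1 and 0 to height - 1.
    x = round(x_fraction * (screen_width - 1))
    y = round(y_fraction * (screen_height - 1))
    return x, y


def show_center():
    """Print the pixel at the middle of the current screen."""
    import pyautogui  # imported here so to_pixels() can be used without a display

    screen_width, screen_height = pyautogui.size()
    print("Screen center:", to_pixels(0.5, 0.5, screen_width, screen_height))
```

Call show_center() to try it on your machine. The same to_pixels call gives comparable positions on screens of different sizes, which is handy when you share a script.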

Getting Cursor Position

The position() function returns the current position of the mouse cursor on the screen.

import pyautogui

currentMouseX, currentMouseY = pyautogui.position()

print("The X Coordinate is: ", currentMouseX)

print("The Y Coordinate is: ", currentMouseY)

When you run this code, it captures the current position of your mouse cursor and prints its X and Y coordinates.

Coordinates are measured from the top-left corner of the screen, which is (0, 0). X increases as you move right, and Y increases as you move down. For instance, if you place the cursor in the top-left corner, both coordinates print as 0.

Knowing where the cursor is lets you find the coordinates of buttons and text fields, which you can then pass to the mouse functions covered in the next section.
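A single reading is often not enough when you are hunting for the coordinates of a button. The sketch below takes several readings in a row so you can move the mouse between them. The function name sample_positions and its parameters are assumptions made for this example. It accepts any function that returns an (x, y) pair, which in practice would be pyautogui.position.

```python
import time


def sample_positions(read_position, samples=5, delay=0.5):
    """Read the cursor position several times and return the readings.

    read_position is any function that returns an (x, y) pair,
    for example pyautogui.position.
    """
    readings = []
    for _ in range(samples):
        x, y = read_position()
        print("X:", x, "Y:", y)
        readings.append((x, y))
        time.sleep(delay)  # give yourself time to move the mouse
    return readings
```

After import pyautogui, calling sample_positions(pyautogui.position) prints five readings half a second apart. PyAutoGUI also ships pyautogui.displayMousePosition(), which keeps printing the position until you stop it.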

# PyAutoGUI Mouse Operations

In this section, we will write a Python script that allows our computer to take control of mouse operations and execute our instructions precisely.

When working with PyAutoGUI for automating mouse operations, we primarily work with coordinates. Coordinates specify the screen position where the mouse should perform the desired operation. Having an understanding of size() and position() will be valuable in this section.

moveTo method

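The moveTo() method moves the mouse to the absolute screen coordinates (x, y), gliding there over the given duration in seconds. Its companion, move(), shifts the mouse relative to its current position.

Example

import pyautogui

pyautogui.moveTo(400, 300, duration=1.0)  # Glide to the absolute position (400, 300) over 1 second.

pyautogui.move(400, 0, duration=1.0)  # Then shift 400 pixels to the right of the current position over 1 second.

The first call places the cursor at (400, 300). The second call moves it 400 pixels to the right along the x-axis, with no vertical movement (the y offset is 0). Each movement takes 1 second.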
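To see absolute movement in action, the sketch below glides the cursor once around a square. The helper names square_corners and trace_square, and the default coordinates, are assumptions chosen for illustration. Note that PyAutoGUI's fail-safe is on by default: pushing the mouse into a corner of the screen raises pyautogui.FailSafeException and stops the script, which is a useful emergency brake while you experiment.

```python
def square_corners(left, top, side):
    """Return the four corners of a square, clockwise from the top-left."""
    return [
        (left, top),
        (left + side, top),
        (left + side, top + side),
        (left, top + side),
    ]


def trace_square(left=300, top=300, side=200, seconds_per_edge=0.5):
    """Glide the cursor once around a square using absolute moveTo() calls."""
    import pyautogui  # imported here so square_corners() works without a display

    corners = square_corners(left, top, side)
    # Visit each corner, then return to the first one to close the square.
    for x, y in corners + corners[:1]:
        pyautogui.moveTo(x, y, duration=seconds_per_edge)
```

Call trace_square() and watch the cursor. Because every point is absolute, the square lands in the same place no matter where the cursor started.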

click method

The click() method simulates a mouse click at the x and y coordinates passed to it. If you omit the coordinates, the click happens at the cursor's current position, so you can call click() right after moveTo() has positioned the mouse over the intended object.

To perform multiple clicks, pass the clicks parameter a whole number, for example clicks=2 for a double-click.

To specify the button you want to click, whether it's the left, right, or middle button, you should use the button argument of the method.

For example, the code click(250, 600, clicks=5, button='right') simulates a right-click event at coordinates 250, 600, five times.

Example

import pyautogui

# Simulate a right-click five times at coordinates (100, 100)
pyautogui.click(x=100, y=100, clicks=5, button='right')
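Clicks often come in batches, such as ticking every cell in a grid of checkboxes. As a rough sketch, the hypothetical helper grid_centers below computes the center of each cell from the grid's top-left corner and cell size, and click_grid clicks each one in turn. All names and measurements here are assumptions for illustration, so measure your own grid with position() first.

```python
def grid_centers(left, top, cell_width, cell_height, rows, columns):
    """Return the center point of every cell in a grid, row by row."""
    centers = []
    for row in range(rows):
        for column in range(columns):
            x = left + column * cell_width + cell_width // 2
            y = top + row * cell_height + cell_height // 2
            centers.append((x, y))
    return centers


def click_grid(left, top, cell_width, cell_height, rows, columns):
    """Left-click the center of each cell once."""
    import pyautogui  # imported here so grid_centers() works without a display

    for x, y in grid_centers(left, top, cell_width, cell_height, rows, columns):
        pyautogui.click(x=x, y=y)
```

Keeping the arithmetic in its own function means you can print the points and sanity-check them before letting the script click anything.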

dragTo method

The dragTo() method is employed to hold and move an element on the screen, typically mimicking a mouse click-and-drag operation.

Here's a detailed explanation of the dragTo method format:

  • x: The x-axis point to which the mouse will be dragged.

  • y: The y-axis point to which the mouse will be dragged.

  • button: Specifies which mouse button to hold down during the drag operation, usually 'left,' 'middle,' or 'right.'

  • duration: Indicates the duration in seconds over which the drag operation will occur.

Example

import pyautogui

pyautogui.dragTo(500, 300, button='left', duration=2)  # duration is a number of seconds, not a string

This code drags the mouse cursor from its current position to the coordinates (500, 300) while holding down the left mouse button, taking 2 seconds to complete the drag.
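A fun way to try dragTo() is to draw with it in a paint program. The sketch below computes the points of a square spiral and drags through them one by one. The names spiral_points and draw_spiral and the default sizes are assumptions for this example. It also assumes a drawing app with a brush selected is in front and covers the start point.

```python
def spiral_points(start_x, start_y, size=200, shrink=20):
    """Return the absolute points of a square spiral that winds inward."""
    points = []
    x, y = start_x, start_y
    distance = size
    directions = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # right, down, left, up
    step = 0
    while distance > 0:
        dx, dy = directions[step % 4]
        x += dx * distance
        y += dy * distance
        points.append((x, y))
        distance -= shrink  # each edge is shorter than the one before
        step += 1
    return points


def draw_spiral(start_x=400, start_y=300):
    """Drag the mouse through the spiral with the left button held down."""
    import pyautogui  # imported here so spiral_points() works without a display

    pyautogui.moveTo(start_x, start_y)
    for x, y in spiral_points(start_x, start_y):
        pyautogui.dragTo(x, y, button="left", duration=0.3)
```

Each dragTo() call presses the button, moves to the next point, and releases, so the spiral is drawn as a chain of straight strokes.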

scroll method

The scroll() method lets you simulate scrolling past contents on the screen and, optionally, specify the location (x and y coordinates) where the scrolling action should occur.

Example

import pyautogui

pyautogui.scroll(10)   # scroll up 10 "clicks"

pyautogui.scroll(-10)  # scroll down 10 "clicks"

pyautogui.scroll(10, x=100, y=200)  # move mouse cursor to 100, 200, then scroll up 10 "clicks"
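Some applications drop scroll events that arrive in one large burst, and the distance covered by one "click" varies between platforms. One possible workaround, sketched below, is to break a long scroll into small chunks with a short pause between them. The names split_scroll and scroll_gradually and the default step and pause are assumptions for illustration.

```python
import time


def split_scroll(total_clicks, step=3):
    """Break one large scroll into chunks that add up to total_clicks.

    Positive values scroll up and negative values scroll down,
    as in pyautogui.scroll(). step must be a positive whole number.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    direction = 1 if total_clicks >= 0 else -1
    remaining = abs(total_clicks)
    chunks = []
    while remaining > 0:
        amount = min(step, remaining)
        chunks.append(direction * amount)
        remaining -= amount
    return chunks


def scroll_gradually(total_clicks, step=3, pause=0.1):
    """Scroll in small steps with a short pause so the page can keep up."""
    import pyautogui  # imported here so split_scroll() works without a display

    for amount in split_scroll(total_clicks, step):
        pyautogui.scroll(amount)
        time.sleep(pause)
```

For example, scroll_gradually(-10) scrolls down ten clicks in chunks of 3, 3, 3, and 1.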

In this section, we've explored fundamental mouse-related methods in PyAutoGUI, including moveTo, click, dragTo, and scroll. These functions form the basis for automating mouse-driven tasks, from basic cursor movements to intricate interactions. As we move on to the keyboard section, keep in mind that effective automation frequently blends both mouse and keyboard actions.

# PyAutoGUI Keyboard Operations

In this section, we'll learn how to simulate keystrokes, keyboard shortcuts, and text input with ease. Whether you're automating data entry, controlling applications, or interacting with text-based interfaces, mastering keyboard automation is an essential skill. From here, we’ll take the knowledge further to automate real-world projects.

typewrite method

The typewrite() method simulates keyboard input by typing text into whichever text field currently has focus.

Typically, typewrite() is combined with mouse operations: the mouse selects a text field, and typewrite() fills it in. We will combine the two to automate a straightforward process of searching for a desktop application.

Example

import pyautogui
import time

# Move the cursor to the Windows search bar over 2 seconds (adjust the coordinates for your screen)
pyautogui.moveTo(8, 756, 2)

# Add a delay to ensure the cursor moves completely
time.sleep(1)  

# Clicks on the windows search bar
pyautogui.click()

# Add a delay to ensure the search bar is open
time.sleep(1)  

# Type "Google Chrome" into the search bar
pyautogui.typewrite("Google Chrome")

In this example, we automated a sequence of actions for finding a desktop application on a Windows computer. The script moves the cursor to the Windows search bar, clicks on it, and types "Google Chrome" into it, simulating the process of searching for the Google Chrome web browser. The coordinates (8, 756) depend on the screen resolution and taskbar layout, so use position() to find the right values on your machine.

press method

The press() method simulates pressing and releasing a single key. You pass it the name of the key as a string, such as 'enter', 'tab', or 'esc'.

To see every key name PyAutoGUI recognizes, print the pyautogui.KEYBOARD_KEYS list with the print() function.
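A misspelled key name is an easy mistake to make and a hard one to spot. As a small sketch, the hypothetical helper unknown_keys below checks a list of names against a list of valid ones, and press_all uses it with pyautogui.KEYBOARD_KEYS before pressing anything. Both function names are assumptions for this example.

```python
def unknown_keys(keys, valid_keys):
    """Return the key names in keys that are missing from valid_keys."""
    valid = set(valid_keys)
    return [key for key in keys if key not in valid]


def press_all(keys):
    """Press each key in turn, refusing to start if any name is misspelled."""
    import pyautogui  # imported here so unknown_keys() works without a display

    bad = unknown_keys(keys, pyautogui.KEYBOARD_KEYS)
    if bad:
        raise ValueError(f"Unknown key names: {bad}")
    for key in keys:
        pyautogui.press(key)
```

For instance, press_all(['tab', 'tab', 'enter']) presses Tab twice and then Enter, while press_all(['entr']) stops with an error before any key is pressed.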

hotkey method

The hotkey() method serves to simulate the simultaneous pressing of multiple keys. It allows you to replicate various keyboard shortcuts, such as copying (Ctrl + C), pasting (Ctrl + V), or cutting (Ctrl + X).

Example

Let's improve on the typewrite() example. Instead of moving the mouse to the search bar and clicking it, we'll call hotkey() with the Windows key ('win'), which opens the Windows Start menu. From there, we'll use typewrite() to type into the search bar and press() to hit 'Enter'.

Once Chrome is open, we'll use hotkey() again, this time with a key combination that focuses the address bar, then type a URL with typewrite() and press 'Enter' to load Google's home page.

import pyautogui

import time

# Open the Windows search bar
pyautogui.hotkey('win')

# Add a delay to ensure the search bar is open
time.sleep(1)  

# Type "Google Chrome" into the search bar
pyautogui.typewrite("Google Chrome")

# Add a delay to ensure the search results appear
time.sleep(1)  

# Select Google Chrome from the search results
pyautogui.press('enter')

# Wait for Google Chrome to open (adjust the sleep time as needed)
time.sleep(5)

# Perform a search on google.com in the Chrome address bar
pyautogui.hotkey('ctrl', 'l')  # Select the address bar

pyautogui.typewrite("https://www.google.com")

pyautogui.press('enter')

This code sequence does the following:

1. Opens the Windows Start menu by pressing the 'win' key.

2. Types "Google Chrome" into the search bar.

3. Presses 'Enter' to open Google Chrome.

4. Waits 5 seconds for Google Chrome to open (you may need to adjust the sleep time based on your system's speed).

5. Presses 'Ctrl' + 'L' to focus the address bar.

6. Types "https://www.google.com" into the address bar.

7. Presses 'Enter' to load the page.
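The steps above are specific to Windows. On macOS, most shortcuts use the Command key where Windows and Linux use Ctrl. One way to keep a script portable, sketched below, is to pick the modifier from sys.platform. The function names shortcut_modifier and copy_and_paste are assumptions for illustration, while 'command' and 'ctrl' are key names that PyAutoGUI recognizes.

```python
import sys


def shortcut_modifier(platform_name=sys.platform):
    """Return the modifier key used for shortcuts such as copy and paste."""
    # sys.platform is "darwin" on macOS, where shortcuts use the Command key.
    return "command" if platform_name == "darwin" else "ctrl"


def copy_and_paste():
    """Copy the current selection, then paste it at the cursor."""
    import pyautogui  # imported here so shortcut_modifier() works without a display

    modifier = shortcut_modifier()
    pyautogui.hotkey(modifier, "c")
    pyautogui.hotkey(modifier, "v")
```

With this in place, the same copy_and_paste() call sends Ctrl + C and Ctrl + V on Windows and Linux, and Command + C and Command + V on macOS.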

# Real-World Projects

As we progress in our pyautogui journey, we have advanced from automating mouse operations to simulating keyboard inputs for tasks such as searching for apps on our system and navigating to Google's website. Now, we are ready to tackle two real-world automation projects:

  • Word Editor Operation Automation Project

  • T-rex Runner Game Automation Project

Project 1: Word Editor Operation Automation Project

Task: Write a Python script that opens Notepad on your PC, maximizes the window, populates it with some text, and saves the file as “freecodecamp_msg”.

Solution:

# Import required libraries

import pyautogui  # For automating user input

import time  # For adding time delays

# Open the Windows Start menu

pyautogui.press('win')

time.sleep(1)

# Type 'notepad' to search for Notepad application
pyautogui.typewrite("notepad")

time.sleep(1)

# Open Notepad by pressing 'Enter'
pyautogui.press('enter')

time.sleep(1)

# Maximize the Notepad window
pyautogui.hotkey('win', 'up')

# Define the recipient's name
recipient_name = "Code Camper"

# Create a text message
message = '''

Subject: Welcome to FreeCodeCamp - Automating File Management - Project Lesson

Dear {},

I hope this message finds you well. I am excited to welcome you to FreeCodeCamp, where you'll embark on an exciting journey into the world of coding and automation. Congratulations on taking this step towards expanding your knowledge and skill set!

Today, you've just delved into an essential aspect of coding - automating file management. This skill is invaluable, as it allows you to streamline repetitive tasks and increase your productivity. Whether you're organizing documents, sorting files, or processing data, automation can significantly simplify your workflow.

With the power of coding, you have the potential to save time, reduce errors, and tackle more complex challenges. As you explore the world of programming, you'll discover that automation is a key driver of efficiency in various industries.

Here at FreeCodeCamp, we're committed to helping you along this coding journey. Our community is a vibrant hub of learners, educators, and enthusiasts who are passionate about sharing knowledge and supporting one another. Don't hesitate to reach out if you have questions, need guidance, or simply want to connect with like-minded individuals.

Keep in mind that learning to code is an ongoing process, and every small step you take is a victory. So, embrace the journey, stay curious, and never stop exploring new possibilities.

Once again, welcome to FreeCodeCamp! We're thrilled to have you here, and we can't wait to see the amazing things you'll achieve.

Happy coding!

Warm regards,

Daniel Olasupo

'''.format(recipient_name)

# Type the text message into Notepad
pyautogui.typewrite(message)

# Save the text message using the keyboard shortcut 'Ctrl + S'
pyautogui.hotkey('ctrl', 's')

# Wait for the Save dialog to open before typing the file name
time.sleep(1)

# Provide a file name for saving the message
pyautogui.typewrite("freecodecamp_msg")

# Confirm the save action by pressing 'Enter'
pyautogui.press('enter')
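One practical snag: if a file with the same name already exists in the save folder, the Save dialog will typically ask whether to replace it, and the script has no step for answering. A simple way around this, sketched below, is to add a timestamp to the file name so every run saves a new file. The helper name timestamped_name is an assumption for this example.

```python
from datetime import datetime


def timestamped_name(base, moment=None):
    """Return base with a date and time suffix, e.g. name_20240102_030405."""
    if moment is None:
        moment = datetime.now()
    return f"{base}_{moment.strftime('%Y%m%d_%H%M%S')}"


def type_unique_file_name(base="freecodecamp_msg"):
    """Type a file name that is different on every run into the Save dialog."""
    import pyautogui  # imported here so timestamped_name() works without a display

    pyautogui.typewrite(timestamped_name(base))
```

In the solution, you could call type_unique_file_name() in place of the line that types "freecodecamp_msg".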

Project 2: T-rex Runner Game Automation Project

Task: Write a Python script to automate playing the popular “T-rex runner” (or “dinosaur”) game on the T-rex runner website.

Solution:

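import time

import keyboard  # Third-party package (pip install keyboard), used to detect the stop key
import pyautogui

# Uncomment to display live mouse coordinates and find the right pixels for your screen
# pyautogui.displayMousePosition()

while True:
    # Grab the screen once per loop and read the pixels of interest
    img = pyautogui.screenshot()

    # Background pixel: tells us whether the game is in light or dark mode
    screen = img.getpixel((163, 415))

    # Probe pixels ahead of the dinosaur on the lower row (y = 523)
    x1 = img.getpixel((568, 523))
    x2 = img.getpixel((501, 523))
    x3 = img.getpixel((588, 523))
    x4 = img.getpixel((473, 523))

    # Probe pixels higher up (y = 469 and y = 405)
    y1 = img.getpixel((502, 469))
    y2 = img.getpixel((515, 469))
    y3 = img.getpixel((465, 469))
    y4 = img.getpixel((455, 405))

    # check screen background (very light gray color)
    if screen[0] == 247:
        # check for light mode: jump if any probe pixel differs from the background
        if x1[0] != 247 or x2[0] != 247 or x3[0] != 247 or x4[0] != 247 or y1[0] != 247 or y2[0] != 247 or y3[0] != 247 or y4[0] != 247:
            pyautogui.press('space')
            time.sleep(0.0001)
    else:
        # check for dark mode: jump if any probe pixel is not black
        if x1[0] != 0 or x2[0] != 0 or x3[0] != 0 or x4[0] != 0 or y1[0] != 0 or y2[0] != 0 or y3[0] != 0 or y4[0] != 0:
            pyautogui.press('space')
            time.sleep(0.0001)

    # Hold 's' to stop the script
    if keyboard.is_pressed('s'):
        break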
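The long conditions in the loop all ask the same question: does any probe pixel differ from the background? As a minimal sketch, that test can be pulled into small helpers. The names expected_background and obstacle_ahead are assumptions for illustration, and the values 247 and 0 are the light and dark backgrounds used in the solution above.

```python
def expected_background(screen_pixel):
    """Return the red-channel value the background should have.

    247 means light mode (very light gray); anything else is treated as
    dark mode, where the background is expected to be 0.
    """
    return 247 if screen_pixel[0] == 247 else 0


def obstacle_ahead(probe_pixels, background_value):
    """Return True if any probe pixel's red channel differs from the background."""
    return any(pixel[0] != background_value for pixel in probe_pixels)
```

Inside the loop, the two long if statements could then become a single check: collect the eight probe pixels in a list, and press space when obstacle_ahead(probes, expected_background(screen)) is true.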

Note: While we've covered two exciting projects in this article, the second project, which involves automating a dinosaur game, is quite extensive and requires a detailed explanation. To provide you with an in-depth understanding of this project, we've created a separate article. You can access the detailed explanation and code walkthrough by clicking here. We encourage you to explore it for a comprehensive understanding of the code and the exciting automation possibilities it offers.

# Conclusion

In conclusion, this course has equipped you with the essential skills and knowledge to harness the power of PyAutoGUI for automating a wide range of computer tasks. You've learned how to create basic Python scripts that can automate repetitive actions, taking the burden off your shoulders. From automating mouse and keyboard operations to simplifying tasks in word processing applications like MS Word and Notepad, you now have the tools to work more efficiently and save valuable time.

Moreover, this course has introduced you to the exciting realm of game automation, allowing you to automate simple games and explore the endless possibilities of PyAutoGUI in various contexts.

With these newfound skills, you are well-prepared to tackle real-world tasks and streamline your workflow, making your computing experience more productive and enjoyable. So, go forth and apply your knowledge with confidence, knowing that you have the ability to make your digital life easier and more automated.