Getting screen coordinates in Python: reading screen pixel values with Python on OS X

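This post looks at how to get pixel colors from the screen efficiently when building an automated game bot in Python on Mac OS X 10.8.2. The author describes autopy's mouse manipulation API and the problem with its screen capture methods, then proposes using PyObjC with CGWindowListCreateImage to capture the screen quickly and read the pixel values from memory, along with timing measurements and a code example.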


I'm building an automated game bot in Python on OS X 10.8.2, and while researching Python GUI automation I discovered autopy. The mouse manipulation API is great, but it seems that the screen capture methods rely on deprecated OpenGL methods...

Are there any efficient ways of getting the color value of a pixel in OS X? The only way I can think of now is to use os.system("screencapture foo.png") but the process seems to have unneeded overhead as I'll be polling very quickly.
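For reference, the shell-out approach mentioned in the question might look like the following minimal sketch. It uses subprocess instead of os.system so the arguments are passed as a list. The helper names build_screencapture_cmd and capture_to_file are hypothetical, introduced only for illustration; the -t (image format) and -x (no sound) flags belong to the macOS screencapture tool.

```python
import subprocess
import sys


def build_screencapture_cmd(path, image_format="png"):
    """Build the argument list for the macOS screencapture CLI.

    -x suppresses the shutter sound, -t selects the image format.
    (Hypothetical helper, not part of the original question.)
    """
    return ["screencapture", "-x", "-t", image_format, path]


def capture_to_file(path, image_format="png"):
    """Run screencapture and return the output path (macOS only)."""
    if sys.platform != "darwin":
        raise RuntimeError("screencapture is only available on macOS")
    # check_call raises CalledProcessError if the tool exits non-zero
    subprocess.check_call(build_screencapture_cmd(path, image_format))
    return path
```

Every call still pays for process creation, image encoding and a round trip through the disk, which is the overhead the question is worried about.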

Solution

A small improvement, but using the TIFF format option (-t tiff) for screencapture is a bit quicker:

$ time screencapture -t png /tmp/test.png
real 0m0.235s
user 0m0.191s
sys 0m0.016s

$ time screencapture -t tiff /tmp/test.tiff
real 0m0.079s
user 0m0.028s
sys 0m0.026s

This does have a lot of overhead, as you say (the subprocess creation, writing/reading from disc, compressing/decompressing).

Instead, you could use PyObjC to capture the screen using CGWindowListCreateImage. I found it took about 70ms (~14fps) to capture a 1680x1050 pixel screen and have the values accessible in memory.

A few random notes:

Importing the Quartz.CoreGraphics module is the slowest part, about 1 second. The same is true for importing most of the PyObjC modules. This is unlikely to matter in this case, but for short-lived processes you might be better off writing the tool in ObjC.

Specifying a smaller region is a bit quicker, but not hugely (~40ms for a 100x100px block, ~70ms for 1680x1050). Most of the time seems to be spent in just the CGDataProviderCopyData call. I wonder if there's a way to access the data directly, since we don't need to modify it?

The ScreenPixel.pixel function is pretty quick, but accessing large numbers of pixels is still slow (0.01ms * 1680*1050 is about 18 seconds). If you need to access lots of pixels, it is probably quicker to struct.unpack_from them all in one go.
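The last note's suggestion can be sketched without any capture code. The helper unpack_bgra below is hypothetical: it assumes a tightly packed BGRA buffer (4 bytes per pixel, no padding at the end of each row) and unpacks every byte with a single struct.unpack_from call, then regroups the result into RGBA tuples. The buffer here is synthetic, standing in for the bytes returned by CGDataProviderCopyData.

```python
import struct


def unpack_bgra(data, width, height):
    """Unpack a tightly packed BGRA buffer into a flat list of RGBA tuples.

    Assumes 4 bytes per pixel and no row padding (an assumption made
    for this sketch, not something the capture API guarantees).
    """
    count = width * height
    # One call unpacks every byte, instead of one call per pixel
    flat = struct.unpack_from("%dB" % (4 * count), data, 0)
    # Regroup as (r, g, b, a); the source order is (b, g, r, a)
    return [(flat[i + 2], flat[i + 1], flat[i], flat[i + 3])
            for i in range(0, 4 * count, 4)]


# Synthetic 2x2 image in BGRA byte order
sample = bytes(bytearray([
    255, 0, 0, 255,   # blue
    0, 255, 0, 255,   # green
    0, 0, 255, 255,   # red
    0, 0, 0, 0,       # transparent black
]))
pixels = unpack_bgra(sample, 2, 2)
```

The pixel at (x, y) is then pixels[y * width + x], so repeated lookups become plain list indexing rather than a struct call each time.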

Here's the code for the PyObjC approach:

import time
import struct

import Quartz.CoreGraphics as CG


class ScreenPixel(object):
    """Captures the screen using CoreGraphics, and provides access to
    the pixel values.
    """

    def capture(self, region=None):
        """region should be a CGRect, something like:

        >>> import Quartz.CoreGraphics as CG
        >>> region = CG.CGRectMake(0, 0, 100, 100)
        >>> sp = ScreenPixel()
        >>> sp.capture(region=region)

        The default region is CG.CGRectInfinite (captures the full screen)
        """

        if region is None:
            region = CG.CGRectInfinite
        else:
            # TODO: Odd widths cause the image to warp. This is likely
            # caused by offset calculation in ScreenPixel.pixel, and
            # could be modified to allow odd widths
            if region.size.width % 2 > 0:
                emsg = "Capture region width should be even (was %s)" % (
                    region.size.width)
                raise ValueError(emsg)

        # Create screenshot as CGImage
        image = CG.CGWindowListCreateImage(
            region,
            CG.kCGWindowListOptionOnScreenOnly,
            CG.kCGNullWindowID,
            CG.kCGWindowImageDefault)

        # Intermediate step, get pixel data as CGDataProvider
        prov = CG.CGImageGetDataProvider(image)

        # Copy data out of CGDataProvider, becomes string of bytes
        self._data = CG.CGDataProviderCopyData(prov)

        # Get width/height of image
        self.width = CG.CGImageGetWidth(image)
        self.height = CG.CGImageGetHeight(image)

    def pixel(self, x, y):
        """Get pixel value at given (x,y) screen coordinates

        Must call capture first.
        """

        # Pixel data is unsigned char (8bit unsigned integer),
        # and there are four (blue,green,red,alpha)
        data_format = "BBBB"

        # Calculate offset, based on
        # http://www.markj.net/iphone-uiimage-pixel-color/
        offset = 4 * ((self.width*int(round(y))) + int(round(x)))

        # Unpack data from string into Python integers
        b, g, r, a = struct.unpack_from(data_format, self._data, offset=offset)

        # Return BGRA as RGBA
        return (r, g, b, a)


if __name__ == '__main__':
    # Timer helper-function
    import contextlib

    @contextlib.contextmanager
    def timer(msg):
        start = time.time()
        yield
        end = time.time()
        print("%s: %.02fms" % (msg, (end-start)*1000))

    # Example usage
    sp = ScreenPixel()

    with timer("Capture"):
        # Take screenshot (takes about 70ms for me)
        sp.capture()

    with timer("Query"):
        # Get pixel value (takes about 0.01ms)
        print("%s %s" % (sp.width, sp.height))
        print(sp.pixel(0, 0))

    # To verify screen-cap code is correct, save all pixels to PNG,
    # using http://the.taoofmac.com/space/projects/PNGCanvas
    from pngcanvas import PNGCanvas
    c = PNGCanvas(sp.width, sp.height)
    for x in range(sp.width):
        for y in range(sp.height):
            c.point(x, y, color=sp.pixel(x, y))

    with open("test.png", "wb") as f:
        f.write(c.dump())
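One plausible explanation for the odd-width warping noted in the TODO is row padding: a CGImage may store each row with extra bytes at the end, so the row stride (which CoreGraphics reports through CGImageGetBytesPerRow) can be larger than 4 * width. This is an assumption, not something the original answer confirms. The sketch below uses a synthetic padded buffer rather than a real capture, and the helpers pixel_offset and pixel_at are hypothetical names introduced for illustration.

```python
import struct


def pixel_offset(x, y, bytes_per_row, bytes_per_pixel=4):
    """Byte offset of pixel (x, y) when rows may be padded."""
    return y * bytes_per_row + x * bytes_per_pixel


def pixel_at(data, x, y, bytes_per_row):
    """Read one BGRA pixel using the row stride; return it as RGBA."""
    b, g, r, a = struct.unpack_from(
        "BBBB", data, pixel_offset(x, y, bytes_per_row))
    return (r, g, b, a)


# Synthetic image: 3 pixels wide, 2 rows, each row padded to 16 bytes
# (3 * 4 = 12 bytes of pixel data plus 4 bytes of padding).
width, height, bytes_per_row = 3, 2, 16
buf = bytearray(bytes_per_row * height)

# Write an opaque red pixel (BGRA = 0, 0, 255, 255) at (2, 1)
struct.pack_into("BBBB", buf, pixel_offset(2, 1, bytes_per_row),
                 0, 0, 255, 255)
data = bytes(buf)

stride_aware = pixel_at(data, 2, 1, bytes_per_row)
# The width-based offset from ScreenPixel.pixel lands on the wrong bytes
naive = struct.unpack_from("BBBB", data, 4 * (width * 1 + 2))
```

If padding is indeed the cause, storing CG.CGImageGetBytesPerRow(image) in capture and using it in place of 4 * self.width in pixel would be the corresponding change to the class above.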