Save the 2D bounding box of an object in rendered image to a text file

Question

I have a blend file with a moving object and want to save the object's bounding box positions (x, y, width and height) in the rendered image to a text file (for every frame):

Determine moving object's bounding box in rendered image

The bounding box values need to be in the pixel space of the rendered 2D image, where x and y are the location of the image pixel at the top-left corner of the box, while width and height are the number of pixels from the top-left corner to the bottom-right corner.
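To make this convention concrete, here is a minimal sketch that is not part of the original question. It assumes the box is first known in normalized camera-frame coordinates (0..1, origin at the bottom left) and maps it to the top-left pixel convention described above. The function name `to_pixel_box` and the sample numbers are illustrative assumptions.

```python
def to_pixel_box(min_x, min_y, max_x, max_y, res_x, res_y):
    """Map normalized bounds (origin bottom-left, range 0..1) to
    (x, y, width, height) in pixels with the origin at the top-left."""
    x = round(min_x * res_x)
    # Image rows count downwards, so the box's top edge (max_y) is flipped.
    y = round(res_y - max_y * res_y)
    width = round((max_x - min_x) * res_x)
    height = round((max_y - min_y) * res_y)
    return x, y, width, height


# A box covering the upper middle of a 1920x1080 render.
print(to_pixel_box(0.25, 0.5, 0.75, 1.0, 1920, 1080))  # (480, 0, 960, 540)
```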

CodeManX · Accepted Answer · 2014-02-20 13:44:44Z

This script determines the camera-space bounding box and calculates the location of its top-left corner in the rendered image, plus its width and height (clamped to the render resolution).

It will return incorrect results if the object is partially behind the camera. It should still work, however, if the object is in front of the camera and crosses the camera view border.

import bpy
from mathutils import Vector

class Box:

    dim_x = 1
    dim_y = 1

    def __init__(self, min_x, min_y, max_x, max_y, dim_x=dim_x, dim_y=dim_y):
        self.min_x = min_x
        self.min_y = min_y
        self.max_x = max_x
        self.max_y = max_y
        self.dim_x = dim_x
        self.dim_y = dim_y

    @property
    def x(self):
        return round(self.min_x * self.dim_x)

    @property
    def y(self):
        return round(self.dim_y - self.max_y * self.dim_y)

    @property
    def width(self):
        return round((self.max_x - self.min_x) * self.dim_x)

    @property
    def height(self):
        return round((self.max_y - self.min_y) * self.dim_y)

    def __str__(self):
        return "<Box, x=%i, y=%i, width=%i, height=%i>" % \
               (self.x, self.y, self.width, self.height)

    def to_tuple(self):
        if self.width == 0 or self.height == 0:
            return (0, 0, 0, 0)
        return (self.x, self.y, self.width, self.height)


def camera_view_bounds_2d(scene, cam_ob, me_ob):
    """
    Returns camera space bounding box of mesh object.

    Negative 'z' value means the point is behind the camera.

    Takes shift-x/y, lens angle and sensor size into account
    as well as perspective/ortho projections.

    :arg scene: Scene to use for frame size.
    :type scene: :class:`bpy.types.Scene`
    :arg cam_ob: Camera object.
    :type cam_ob: :class:`bpy.types.Object`
    :arg me_ob: Mesh object (its world transform is applied here).
    :type me_ob: :class:`bpy.types.Object`
    :return: a Box object (call its to_tuple() method to get x, y, width and height)
    :rtype: :class:`Box`
    """

    mat = cam_ob.matrix_world.normalized().inverted()
    me = me_ob.to_mesh(scene, True, 'PREVIEW')
    me.transform(me_ob.matrix_world)
    me.transform(mat)

    camera = cam_ob.data
    frame = [-v for v in camera.view_frame(scene=scene)[:3]]
    camera_persp = camera.type != 'ORTHO'

    lx = []
    ly = []

    for v in me.vertices:
        co_local = v.co
        z = -co_local.z

        if camera_persp:
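            if z == 0.0:
                # Vertex lies in the camera plane: use the view center and
                # skip the projection below so it is not appended twice.
                lx.append(0.5)
                ly.append(0.5)
                continue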
            # Does it make any sense to drop these?
            #if z <= 0.0:
            #    continue
            else:
                frame = [(v / (v.z / z)) for v in frame]

        min_x, max_x = frame[1].x, frame[2].x
        min_y, max_y = frame[0].y, frame[1].y

        x = (co_local.x - min_x) / (max_x - min_x)
        y = (co_local.y - min_y) / (max_y - min_y)

        lx.append(x)
        ly.append(y)

    min_x = clamp(min(lx), 0.0, 1.0)
    max_x = clamp(max(lx), 0.0, 1.0)
    min_y = clamp(min(ly), 0.0, 1.0)
    max_y = clamp(max(ly), 0.0, 1.0)

    bpy.data.meshes.remove(me)

    r = scene.render
    fac = r.resolution_percentage * 0.01
    dim_x = r.resolution_x * fac
    dim_y = r.resolution_y * fac

    return Box(min_x, min_y, max_x, max_y, dim_x, dim_y)


def clamp(x, minimum, maximum):
    return max(minimum, min(x, maximum))


def write_bounds_2d(filepath, scene, cam_ob, me_ob, frame_start, frame_end):

    with open(filepath, "w") as file:
        for frame in range(frame_start, frame_end + 1):
            # Advance the scene that was passed in, not bpy.context.scene.
            scene.frame_set(frame)
            file.write("%i %i %i %i\n" % camera_view_bounds_2d(scene, cam_ob, me_ob).to_tuple())

def main(context):

    filepath = r"D:\temp\bounds_2d.txt"

    scene = context.scene
    cam_ob = scene.camera
    me_ob = context.object

    frame_current = scene.frame_current
    frame_start = scene.frame_start
    frame_end = scene.frame_end

    write_bounds_2d(filepath, scene, cam_ob, me_ob, frame_start, frame_end)

    scene.frame_set(frame_current)

main(bpy.context)

One could try cutting the mesh at the camera view border (in world space, camera space or NDC?) and discarding all coordinates outside the frame. That should fix the bounding box problem for objects partially behind the camera. If these coordinates were dropped without cutting, the script would likely return a box that is too small (at the moment it should return a box that is too large in these cases).
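As a rough sketch of the cutting idea, and not something wired into the script above, each mesh edge could be clipped against a plane just in front of the camera, and the clipped endpoints projected instead of the raw vertices. The helper name, the plain-tuple point representation and the `near` value are assumptions for illustration. Depth is taken as positive in front of the camera, matching `z = -co_local.z` in the script.

```python
def clip_edge_to_near_plane(a, b, near=0.001):
    """Clip the segment a-b against the plane depth == near.

    a and b are camera-space points given as (x, y, depth) tuples, where
    depth > 0 means "in front of the camera" (an assumed convention).
    Returns the visible part as a pair of points, or None if the whole
    segment lies behind the plane.
    """
    a_in = a[2] >= near
    b_in = b[2] >= near
    if a_in and b_in:
        return a, b
    if not a_in and not b_in:
        return None
    # Exactly one endpoint is visible: interpolate to the plane crossing.
    t = (near - a[2]) / (b[2] - a[2])
    hit = tuple(a[i] + t * (b[i] - a[i]) for i in range(3))
    return (a, hit) if a_in else (hit, b)


# One endpoint in front of the camera, one behind it.
print(clip_edge_to_near_plane((0, 0, 1), (2, 0, -1), near=0.5))
```

The returned endpoints could then go through the same projection as the vertices. Edges that lie entirely behind the camera contribute nothing to the box.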

Thanks CoDEmanX for your answer. I'm trying to run this script using filename = "D:/Blender/bb.py" and then exec(compile(open(filename).read(), filename, 'exec')), but it gives an error. When I press Enter to execute it, a very large window appears, which has never happened to me before; I took a screenshot to show what happens: dropbox.com/s/rkwxwk8yrzvr21q/blender_Capture.PNG. When I press Enter twice, this error appears: "SyntaxError: multiple statements found while compiling a single statement". Could you please advise? Many thanks, CoDEmanX — Tak, Commented Feb 20, 2014 at 14:17
Why don't you run it from Text Editor? Also, the filepath doesn't seem right, you need to use backslashes on windows (this is not a Blender path!), and they need to be either escaped, or the string be a raw string r"D:\Blender\..." — CodeManX, Commented Feb 20, 2014 at 14:45
So what does it write to console? (Window > Toggle System Console on Windows) — CodeManX, Commented Feb 20, 2014 at 15:01
thanks CoDEmanX, I solved it. I needed to modify the script to choose the object I want instead of using context.object, I used bpy.data.objects.get("Suzanne"). Many Thanks again CoDEmanX for your kind assistance and support — Tak, Commented Feb 20, 2014 at 15:05
I've just run it and tested the results. For frame 1, whose image "Image0001.jpg" I uploaded earlier, the bounding box result is (15 421 482 260), but the pixel at position (15, 421) is far away from the object, as shown in this picture: dropbox.com/s/wlbnrl9oyoq4jtn/blender_Capture1.PNG. Do you know why this is happening? — Tak, Commented Feb 20, 2014 at 15:18

juniorxsound · Accepted Answer · 2019-11-16 21:48:28Z

For anyone looking for a Blender 2.8 version of @CoDEmanX's answer:

import bpy


def clamp(x, minimum, maximum):
    return max(minimum, min(x, maximum))

def camera_view_bounds_2d(scene, cam_ob, me_ob):
    """
    Returns camera space bounding box of mesh object.

    Negative 'z' value means the point is behind the camera.

    Takes shift-x/y, lens angle and sensor size into account
    as well as perspective/ortho projections.

    :arg scene: Scene to use for frame size.
    :type scene: :class:`bpy.types.Scene`
    :arg cam_ob: Camera object.
    :type cam_ob: :class:`bpy.types.Object`
    :arg me_ob: Mesh object (its world transform is applied here).
    :type me_ob: :class:`bpy.types.Object`
    :return: (x, y, width, height) in pixels, or (0, 0, 0, 0) if the box is empty
    :rtype: tuple of int
    """

    mat = cam_ob.matrix_world.normalized().inverted()
    depsgraph = bpy.context.evaluated_depsgraph_get()
    mesh_eval = me_ob.evaluated_get(depsgraph)
    me = mesh_eval.to_mesh()
    me.transform(me_ob.matrix_world)
    me.transform(mat)

    camera = cam_ob.data
    frame = [-v for v in camera.view_frame(scene=scene)[:3]]
    camera_persp = camera.type != 'ORTHO'

    lx = []
    ly = []

    for v in me.vertices:
        co_local = v.co
        z = -co_local.z

        if camera_persp:
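            if z == 0.0:
                # Vertex lies in the camera plane: use the view center and
                # skip the projection below so it is not appended twice.
                lx.append(0.5)
                ly.append(0.5)
                continue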
            # Does it make any sense to drop these?
            # if z <= 0.0:
            #    continue
            else:
                frame = [(v / (v.z / z)) for v in frame]

        min_x, max_x = frame[1].x, frame[2].x
        min_y, max_y = frame[0].y, frame[1].y

        x = (co_local.x - min_x) / (max_x - min_x)
        y = (co_local.y - min_y) / (max_y - min_y)

        lx.append(x)
        ly.append(y)

    min_x = clamp(min(lx), 0.0, 1.0)
    max_x = clamp(max(lx), 0.0, 1.0)
    min_y = clamp(min(ly), 0.0, 1.0)
    max_y = clamp(max(ly), 0.0, 1.0)

    mesh_eval.to_mesh_clear()

    r = scene.render
    fac = r.resolution_percentage * 0.01
    dim_x = r.resolution_x * fac
    dim_y = r.resolution_y * fac

    # Sanity check
    if round((max_x - min_x) * dim_x) == 0 or round((max_y - min_y) * dim_y) == 0:
        return (0, 0, 0, 0)

    return (
        round(min_x * dim_x),            # X
        round(dim_y - max_y * dim_y),    # Y
        round((max_x - min_x) * dim_x),  # Width
        round((max_y - min_y) * dim_y)   # Height
    )

# Print the result
print(camera_view_bounds_2d(bpy.context.scene, bpy.context.scene.camera, bpy.context.object))
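The 2.8 version above only prints the box for the current frame. A minimal sketch of writing one line per frame to a text file, as the question asks, might look like the following. The function name `write_bounds_per_frame` and the callback design are assumptions. The callback keeps the writer independent of Blender, so the demo at the bottom runs with a dummy function.

```python
import os
import tempfile


def write_bounds_per_frame(filepath, frames, bounds_for_frame):
    """Write one "x y width height" line per frame.

    bounds_for_frame is any callable that takes a frame number and
    returns a 4-tuple of ints (hypothetical hook, see the note below).
    """
    with open(filepath, "w") as f:
        for frame in frames:
            f.write("%i %i %i %i\n" % tuple(bounds_for_frame(frame)))


# Inside Blender the callback could be something like (assumed usage):
#     def bounds_for_frame(frame):
#         scene.frame_set(frame)
#         return camera_view_bounds_2d(scene, scene.camera, obj)

# Stand-alone demo with a dummy callback instead of a real scene.
path = os.path.join(tempfile.gettempdir(), "bounds_2d_demo.txt")
write_bounds_per_frame(path, range(1, 4), lambda frame: (frame, 0, 10, 20))
```

In Blender, `frames` would typically be `range(scene.frame_start, scene.frame_end + 1)`, and the current frame should be restored afterwards, as the original script does.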

Thanks for the script! So from what I understand this operates on the mesh and will ignore the scale set on the object. Do you know any way to include the scale? — paulgavrikov, Commented Aug 20, 2020 at 10:43
