
I have fetched a file from the internet in Python and would like to save the URL that the file came from. According to this question from 2010, I would expect the following to work using pyxattr:

import xattr
import plistlib

url = "https://apple.com"
out_path = "/private/tmp/testmd.txt"

with open(out_path, "w") as f:
    f.write("hi")

wherefroms = plistlib.dumps([url], fmt=plistlib.FMT_BINARY)
xattr.set(out_path, "com.apple.metadata:kMDItemWhereFroms", wherefroms)

Unfortunately, it does not:

$ mdls /tmp/testmd.txt
kMDItemFSContentChangeDate = 2022-09-20 14:29:59 +0000
kMDItemFSCreationDate      = 2022-09-20 14:12:16 +0000
kMDItemFSCreatorCode       = ""
kMDItemFSFinderFlags       = 0
kMDItemFSHasCustomIcon     = 0
kMDItemFSInvisible         = 0
kMDItemFSIsExtensionHidden = 0
kMDItemFSIsStationery      = 0
kMDItemFSLabel             = 0
kMDItemFSName              = "testmd.txt"
kMDItemFSNodeCount         = 2
kMDItemFSOwnerGroupID      = 0
kMDItemFSOwnerUserID       = 501
kMDItemFSSize              = 2
kMDItemFSTypeCode          = ""

It's missing in Finder's Get Info panel as well. But! The binary plist with the correct value HAS been added to the xattrs:

$ xattr -l /tmp/testmd.txt
com.apple.metadata:kMDItemWhereFroms: bplist00�_https://apple.com

I notice that almost every file contains an array of two URLs: the actual download URL and the referrer (e.g. wikimedia.org and wikipedia.org). So I tried adding an array with two strings, but the behavior didn't change.

I also noticed that when I modify a URL on a file for which Finder IS showing the Where From URL, the new URL is immediately visible in the Get Info panel.

How can I get the URL to show up in Finder? Might it be something to do with caching?

  • You're actually mixing up 2 different things: extended attributes and Spotlight (mdls, mdutil...). Extended attributes are stored per file and some of them are imported by Spotlight into its database. If you make a modification, you need to signal Spotlight to reimport metadata from the file.
    – SteffX
    Commented Aug 14, 2023 at 10:04
  • Thanks for the insight. How does one signal Spotlight in this way?
    – Nick K9
    Commented Aug 15, 2023 at 12:15
  • Besides completely rebuilding the Spotlight database, I think 'mdimport' should do it. Remember, though, that it is an asynchronous process ('mds' calling different 'mdimporter's). I don't think there is a way to know when the file will be processed, as of Catalina. Anyway, there are a lot of interesting things in the man pages for mdimport, mdfind, mdutil and mdls.
    – SteffX
    Commented Aug 18, 2023 at 12:03

1 Answer


To do this reliably, you need to call the undocumented function MDItemSetAttribute, which is possible but not straightforward from Python. Fortunately, the osxmetadata library has done the hard work for you. As of version 1.0.0, it is as easy as:

from osxmetadata import OSXMetaData
from pathlib import Path

url = "https://apple.com"
out_path = Path.home() / "test_md.txt"

with open(out_path, "w") as f:
    f.write("hi")

meta = OSXMetaData(out_path)
meta.kMDItemWhereFroms = [url]

One caveat is that, for some reason, this does NOT work reliably for files in /tmp.
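
To check what actually ended up in the extended attribute, independently of what Spotlight has indexed, the binary plist can be decoded again with plistlib. This is only a minimal sketch: the helper names `encode_where_froms` and `decode_where_froms` are made up for illustration and are not part of osxmetadata or pyxattr.

```python
import plistlib

# Attribute name taken from the question.
WHERE_FROMS = "com.apple.metadata:kMDItemWhereFroms"


def encode_where_froms(urls):
    """Serialize a list of URL strings as a binary plist (hypothetical helper)."""
    return plistlib.dumps(list(urls), fmt=plistlib.FMT_BINARY)


def decode_where_froms(raw):
    """Parse raw xattr bytes back into a list of URL strings (hypothetical helper)."""
    # plistlib.loads detects the binary format on its own.
    return plistlib.loads(raw)


if __name__ == "__main__":
    raw = encode_where_froms(["https://apple.com"])
    print(raw[:8])                  # binary plists begin with b"bplist00"
    print(decode_where_froms(raw))  # the original list of URLs
```

On macOS, the raw bytes would presumably come from something like `xattr.get(out_path, WHERE_FROMS)` with pyxattr, the same module used in the question. Note that this only confirms the attribute on the file itself; `mdls` and Finder show what Spotlight has imported, which can lag behind or differ.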
