unix - Reason for disparate results with mdfind using Python subprocess
I'm trying to write a Python wrapper for the Unix mdfind utility. In its simplest form it works well; however, I cannot figure out one instance of odd behavior. Things get a bit odd when running more complex queries (two or more fields). Take the following example:
    import subprocess
    import itertools


    def test1():
        cmd = "mdfind 'kMDItemFSName=pandoc&&kMDItemContentType=public.unix-executable'"
        shell_res = subprocess.check_output(cmd, shell=True)
        find_res = mdfind(content_type='public.unix-executable', name='pandoc')
        if shell_res == find_res:
            print('passed!')


    def mdfind(**kwargs):
        cmd = ['mdfind']
        # each matching keyword becomes its own command line argument
        for key, arg in kwargs.iteritems():
            if key in mdattributes().keys():
                md_name = mdattributes()[key]['id']
                query = '='.join([md_name, arg])
                cmd.append(query)
        if 'only_in' in kwargs:
            cmd.append('-onlyin')
            cmd.append(kwargs['only_in'])
        return subprocess.check_output(cmd)


    def mdattributes():
        attributes_str = subprocess.check_output(['mdimport', '-A'])
        # prepare key names for the 4 columns
        keys = ('id', 'name', 'description', 'aliases')
        # create list of dicts, mapping ``keys`` to each item's columns
        data = [dict(itertools.izip(keys,
                                    [item.replace("'", "")
                                     for item in attribute.split('\t\t')]))
                for attribute in attributes_str.splitlines()]
        # coerce list of dicts into one large dict of nested dicts
        metadata = {}
        for md_dict in data:
            # clean up key
            # note: the keys must also end up matching the keyword names
            # passed to mdfind() (e.g. 'content_type'); that normalization
            # step is not shown here
            key = md_dict['id'].replace('kMDItemFS', '')\
                               .replace('kMDItem', '')\
                               .replace('kMD', '')\
                               .replace('com_', '')
            metadata[key] = md_dict
        return metadata


    test1()
This code will pass: the straight shell command and the command created by the wrapper output the same result.
Now, take this example, which seems to me to be of the same kind, yet it doesn't work:
    def test2():
        cmd = """mdfind 'kMDItemKind=pdf&&kMDItemFSName="*epistem*"c'"""
        shell_res = run_shell(cmd)
        find_res = mdfind(kind='pdf', name='"*epistem*"c')
The straight shell command returns the single PDF on my machine that has "epistemology" in the title, while the command made by the wrapper returns 13 PDFs (I have 1,000+ PDFs on my machine in total). So the wrapper script is filtering the thousands of PDFs somehow, but apparently not by whether *epistem* is in the title.
Even more oddly, this command returns 144 results:
    subprocess.check_output(['mdfind', """kMDItemKind=pdf&&kMDItemFSName="*epistemolog*"c"""])
So, in short, these 3 different subprocess calls give radically different numbers of results:

    """mdfind 'kMDItemKind=pdf&&kMDItemFSName="*epistem*"c'"""
    ['mdfind', 'kMDItemKind=pdf', u'kMDItemFSName="*epistem*"c']
    ['mdfind', """kMDItemKind=pdf&&kMDItemFSName="*epistemolog*"c"""]
So, my question is: why? Why does subprocess.check_output() return 1 result for the straight shell command (by which I mean a command string with shell=True set), 13 results for the 3-item list command, and 144 results for the 2-item list command? What is going on under the covers? How can I get the 3-item list to return the 1 item that the straight shell command does?
I am sure this has to do with subtle but important differences in the command line argument processing pipeline. This pipeline is complex, and when invoking a command from within a programming language environment it is often quite difficult to obtain behavior equivalent to typing the command in your favorite shell.
The bad thing is: depending on the method the target executable uses for parsing its command line arguments (there is unfortunately, in many cases, no definite standard), the outcome may vary among invocation methods. That is, your observation has to do with the processing and interpretation of whitespace, null characters, dashes, and quotes.
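One part of the difference can be inspected without running mdfind at all, by comparing the argument lists that each invocation hands to the program. The following is only a rough sketch: shlex.split approximates what a POSIX shell does with the command string, join_conditions is a hypothetical helper (not part of the wrapper in the question), and nothing here says how mdfind itself interprets each form.

```python
import shlex

# The command string that was run with shell=True.
shell_cmd = """mdfind 'kMDItemKind=pdf&&kMDItemFSName="*epistem*"c'"""

# The two list forms from the question (run without a shell).
three_item = ['mdfind', 'kMDItemKind=pdf', 'kMDItemFSName="*epistem*"c']
two_item = ['mdfind', 'kMDItemKind=pdf&&kMDItemFSName="*epistemolog*"c']

# Approximation of the shell's word splitting and quote removal.
shell_argv = shlex.split(shell_cmd)


def join_conditions(conditions):
    """Hypothetical helper: put all conditions into ONE query argument."""
    return ['mdfind', '&&'.join(conditions)]


joined = join_conditions(['kMDItemKind=pdf', 'kMDItemFSName="*epistem*"c'])

for label, argv in [('shell', shell_argv), ('3-item', three_item),
                    ('2-item', two_item), ('joined', joined)]:
    print(label, len(argv), argv)
```

Under these assumptions, the shell form delivers a single query argument containing &&, whereas the 3-item list delivers two separate arguments; joining the conditions with && before appending reproduces the shell's argument list. Note also that the 2-item list uses a different pattern (*epistemolog* rather than *epistem*), so it is not the same query as the other two.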
Your question is "why?". If you want to get to the bottom of this, you need to look at the source code of Python's subprocess module and at the source code of the target command's command line argument parsing. Also, you might want to have these reads:
- http://www.daviddeley.com/autohotkey/parameters/parameters.htm
- http://gehrcke.de/2014/02/command-line-argument-binary-data/
In order to obtain behavior conceptually equivalent to typing in the shell, there is a non-obvious but dead-simple workaround: create a temporary shell script, invoke the shell from within Python, and provide it with just one argument: the path to the shell script. I have used this method in a module for creating systematic command line tool tests: https://github.com/jgehrcke/timegaps/blob/master/test/clitest.py
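A minimal sketch of that workaround could look like the following. The function name run_via_script and the default shell path /bin/sh are assumptions made for illustration and are not taken from the linked module; the command used here is a harmless printf stand-in, which on macOS could be replaced by the mdfind command line.

```python
import os
import subprocess
import tempfile


def run_via_script(command_line, shell='/bin/sh'):
    """Write command_line to a temporary script and run it with the shell.

    Hypothetical helper; the shell path is an assumption.
    """
    with tempfile.NamedTemporaryFile('w', suffix='.sh', delete=False) as f:
        f.write(command_line + '\n')
        path = f.name
    try:
        # The shell receives exactly one argument: the path to the script.
        # All quoting inside the script is handled by the shell itself.
        return subprocess.check_output([shell, path])
    finally:
        os.remove(path)


# Stand-in command with the same kind of nested quoting as the mdfind query.
out = run_via_script(r"""printf '%s\n' 'a&&b="*c*"c'""")
print(out)
```

Because the command line is written to the file exactly as it would be typed, there is no second layer of quoting to get wrong on the Python side.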