-r Recursive option #129
| excluded_files = args.excluded_paths | ||
| test = discover_files(directory_path, excluded_files) | ||
|
|
||
| print(test) |
just to see if it works
|
Hi @omergunal, thanks for the soup! :D Can you merge master into this branch and then resolve the merge conflicts? 👍 |
|
No problem :) OK, I will do it. |
|
In usage.py, "filepath" is required. Should we make it optional? If we use "-r" we do not need it. |
|
Can you also add
parser.add_argument(
    'targets', metavar='targets', type=str, nargs='*',
    help='source file(s) or directory(s) to be tested'
) |
|
Let's do this https://github.com/PyCQA/bandit/blob/master/bandit/cli/main.py#L153-L160 and then remove |
|
I realize I'm totally changing my mind from what I said before, "This will enable a user to just give -r /path/to/files instead of -f file one at a time." but this seems cleaner. |
|
i.e. |
|
You mean we will delete the "-f" option and use "-r" for both single-file and directory scans. And if the user does not pass any parameter, it will scan one file. Is that correct?
| action='store', | ||
| default='', | ||
| help='Separate files with commas' | ||
| ) |
| help='do not skip lines with # nosec comments' | ||
| ) | ||
|
|
||
| optional_group.add_argument( |
Maybe make it
parser.add_argument(
'-r', '--recursive', dest='recursive',
action='store_true', help='find and process files in subdirectories'
)
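A minimal sketch of how the positional targets, -r and -x could fit together, pieced together from the snippets in this thread. The build_parser helper is hypothetical, and nargs='+' (rather than '*') is an assumption so that at least one target is required; the real wiring lives in usage.py and may differ.

```python
import argparse


def build_parser():
    # Hypothetical helper for illustration; not the project's actual usage.py.
    parser = argparse.ArgumentParser(prog='python -m pyt')
    # Positional targets replace the old -f option (assumption: at least one).
    parser.add_argument(
        'targets', metavar='targets', type=str, nargs='+',
        help='source file(s) or directory(s) to be tested'
    )
    parser.add_argument(
        '-r', '--recursive', dest='recursive',
        action='store_true', help='find and process files in subdirectories'
    )
    parser.add_argument(
        '-x', dest='excluded_paths',
        action='store', default='',
        help='Separate files with commas'
    )
    return parser


args = build_parser().parse_args(['-r', '-x', 'skip.py', 'examples', 'a.py'])
```

With this layout a single file, several files, or a directory are all passed the same way, and -r only changes how deep directories are searched.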
|
|
||
|
|
||
| def discover_files(directory_path, excluded_files): | ||
| file_list = [] |
Nit: We mostly use list() everywhere in assignments in the codebase, just for consistency.
| if os.path.splitext(fullpath)[1] == '.py' and fullpath.split("/")[-1] not in excluded_list: | ||
| file_list.append(fullpath) | ||
|
|
||
| return(file_list) |
Nit: just for consistency, you can do return included_files (and I guess rename files to included_files)
|
re:"You mean, we will delete Delete
def discover_files(targets, excluded_files, recursion):
included_files = list()
excluded_list = excluded_files.split(",")
for target in targets:
if target.endswith('.py'):
included_files.append(target)
else:
for root, dirs, files in os.walk(target):
for f in files:
fullpath = os.path.join(root, f)
if os.path.splitext(fullpath)[1] == '.py' and fullpath.split("/")[-1] not in excluded_list:
included_files.append(fullpath)
if not recursion:
break
return included_files |
|
(just updated the code, should be better now.) |
|
Returning "included_list" seems good. Also, the "-x" parameter is available.
|
|
||
|
|
||
|
|
||
| targets = args.targets |
I think it might be more DRY if you did
files = discover_files(
args.targets,
args.excluded_paths,
args.recursive
)
| default='', | ||
| help='Separate files with commas' | ||
| ) | ||
| optional_group.add_argument( |
I guess targets will be part of _add_required_group b/c it's replacing -f files
KevinHock left a comment
Almost there, looking good :D
| directory = os.path.dirname(path) | ||
| project_modules = get_modules(directory) | ||
| local_modules = get_directory_modules(directory) | ||
| for path in files: |
Before this for loop, you can have a vulnerabilities = list(), and then do
vulnerabilities.append(find_vulnerabilities(
cfg_list,
ui_mode,
args.blackbox_mapping_file,
args.trigger_word_file,
nosec_lines
))
Does it find vulnerabilities in all the files, or just the last file (i.e. last iteration of the loop)? If I'm reading it right, I think it might do e.g. vulnerabilities = kitmap_vulns...then finally vulnerabilities = a.py_vulns, and only report the last list.
(As an aside, it seems strange it's not printing out the vulnerability info, and just seems to print the object.)
You are right, it is taking the last item in the list. Do you have an idea for a fix?
I think the fix is having a list outside of the for loop and adding the vulnerabilities of each file to it. The code from my first comment should do it, although now that I think about it it's probably extend and not append.
^Ahh, this was it! 👍 Just change append to extend and it'll all work! :D
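For reference, a small sketch of the append vs. extend difference when collecting per-file results. find_vulnerabilities_stub is a hypothetical stand-in that returns a list of findings for one file; it is not the project's find_vulnerabilities.

```python
def find_vulnerabilities_stub(path):
    # Hypothetical stand-in: pretend each file yields two findings.
    return ['{}: finding {}'.format(path, i) for i in range(2)]


files = ['a.py', 'b.py']

appended = list()
extended = list()
for path in files:
    # append adds the returned list as ONE element (a list of lists).
    appended.append(find_vulnerabilities_stub(path))
    # extend adds each finding individually (one flat list).
    extended.extend(find_vulnerabilities_stub(path))
```

Code that expects a flat list of vulnerabilities would iterate over lists instead of findings in the append case, which is the kind of mismatch discussed here.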
| nosec_lines | ||
| ) | ||
|
|
||
| if args.baseline: |
You can de-dent this if args.baseline: block, as a single call to get_vulnerabilities_not_in_baseline with the vulnerabilities of every file will work.
| args.excluded_paths, | ||
| args.recursive | ||
| ) | ||
| vulnerabilities = list() |
Hmm, That's odd, I'll checkout/test your code later today to try and find the issue. 👍
That's really weird, I'll look more in-depth on Monday 👍
I checked out your branch and it was append vs. extend
KevinHock left a comment
Super close, just the de-dent and append vs. extend and I think that's mostly it :)
| local_modules = get_directory_modules(directory) | ||
| tree = generate_ast(path) | ||
|
|
||
| if args.baseline: |
You can keep this, but just de-dent it so that we only trim once.
|
So I looked at the tests that were failing and the You can do e.g.
class MainTest(BaseTestCase):
+ @mock.patch('pyt.__main__.discover_files')
@mock.patch('pyt.__main__.parse_args')
@mock.patch('pyt.__main__.find_vulnerabilities')
@mock.patch('pyt.__main__.text')
- def test_text_output(self, mock_text, mock_find_vulnerabilities, mock_parse_args):
+ def test_text_output(self, mock_text, mock_find_vulnerabilities, mock_parse_args, mock_discover_files):
mock_find_vulnerabilities.return_value = 'stuff'
example_file = 'examples/vulnerable_code/inter_command_injection.py'
output_file = 'mocked_outfile'
+ mock_discover_files.return_value = [example_file]
mock_parse_args.return_value = mock.Mock(
autospec=True,
- filepath=example_file,
project_root=None,
baseline=None,
            json=None,
and the same for the other tests. This makes it so that in the tests, we don't really ever call |
| ) | ||
| initialize_constraint_table(cfg_list) | ||
| analyse(cfg_list) | ||
| vulnerabilities.extend(find_vulnerabilities( |
No, there are no vulnerabilities in a.py, b.py and c.py, but it is printing the ones from xss.py
I didn't figure out the bug yet, gonna look more tomorrow 😁 This is harder than expected to track down
OK, I fixed it. It was about the location of vulnerabilities = list().
KevinHock left a comment
Can you write tests for discover_files when you get a chance? 👍
| included_files.append(fullpath) | ||
| else: | ||
| if target not in excluded_list: | ||
| included_files.append(targets[0]) |
So if targets is a list of files, e.g. python -m pyt examples/vulnerable_code/command_injection.py examples/vulnerable_code/XSS.py, then discover_files will return the first file N times. (Where N is the len of targets.)
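A tiny sketch of the behaviour described above, using only the loop shape from the diff (the file names are shortened for illustration):

```python
targets = ['command_injection.py', 'XSS.py']

buggy = list()
fixed = list()
for target in targets:
    # Indexing with [0] ignores the loop variable, so the first
    # file is added once per iteration.
    buggy.append(targets[0])
    # Using the loop variable adds each target exactly once.
    fixed.append(target)
```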
|
|
||
| for target in targets: | ||
| if os.path.isdir(target): | ||
| if recursive: |
Having if recursive: here means that if you don't pass -r, directories won't be searched at all.
You can change it to:
def discover_files(targets, excluded_files, recursive=False):
included_files = list()
excluded_list = excluded_files.split(",")
for target in targets:
if os.path.isdir(target):
for root, dirs, files in os.walk(target):
for f in files:
fullpath = os.path.join(root, f)
if os.path.splitext(fullpath)[1] == '.py' and fullpath.split("/")[-1] not in excluded_list:
included_files.append(fullpath)
if not recursive:
break
else:
if target not in excluded_list:
included_files.append(target)
    return included_files
| args.recursive | ||
| ) | ||
| for path in files: | ||
| vulnerabilities = list() |
So the bug I found yesterday, or more accurately the thing I don't understand 😕 , is that find_vulnerabilities returns the vulnerabilities for all the files previously analyzed, as if find_vulnerabilities knows all the vulnerabilities found for the other files we've looked at, how does it know this? 😱
So as far as the PR, you can change it to what you had originally, i.e. vulnerabilities = find_vulnerabilities(..), sorry I misunderstood the code, I'll still look into the reason why the code does this though.
Aha, figured it out, so I knew constraint_table etc. were global variables that keep state, and that they could be the culprit if they were used weirdly, however it is due to FrameworkAdaptor https://github.com/python-security/pyt/blob/master/pyt/web_frameworks/framework_adaptor.py#L88 adding all the past CFGs to the list :) I'll add a comment to __main__.py about it after we merge this PR
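To illustrate the effect, here is a simplified sketch in which a list created once outside the loop accumulates state across files. analyse_file_stub and the 'vuln in ...' strings are hypothetical; this is not the actual FrameworkAdaptor code, only the general pattern of a shared list growing on every iteration.

```python
def analyse_file_stub(path, cfg_list):
    # Hypothetical stand-in: each call adds this file's entry to the shared
    # list, then reports findings for EVERYTHING accumulated so far.
    cfg_list.append(path)
    return ['vuln in ' + p for p in cfg_list]


cfg_list = list()  # shared across iterations, never reset

vulnerabilities = list()
duplicated = list()
for path in ['a.py', 'b.py', 'xss.py']:
    result = analyse_file_stub(path, cfg_list)
    # Plain assignment is enough: the last result already covers all files.
    vulnerabilities = result
    # Extending on top of that would count earlier files repeatedly.
    duplicated.extend(result)
```

Under that assumption, plain assignment inside the loop ends up with the complete set, while extend would report earlier files more than once.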
| if os.path.isdir(target): | ||
| for root, dirs, files in os.walk(target): | ||
| for f in files: | ||
| if not recursive: |
It's important for the if not recursive: to be after the
fullpath = os.path.join(root, f)
if os.path.splitext(fullpath)[1] == '.py' and fullpath.split("/")[-1] not in excluded_list:
    included_files.append(fullpath)
This is so that we only iterate through the for f in files: loop once, i.e. just one level of depth and not recursively.
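A self-contained sketch of discover_files with the break placed after the inner loop, plus a small demo on a temporary directory. It follows the version suggested in this thread, with one swapped-in detail: os.path.basename is used instead of split("/")[-1] (an assumption made for portability). The demo helper and its file names are hypothetical.

```python
import os
import tempfile


def discover_files(targets, excluded_files, recursive=False):
    included_files = list()
    excluded_list = excluded_files.split(",")
    for target in targets:
        if os.path.isdir(target):
            # os.walk is top-down by default, so the first iteration
            # is the target directory itself.
            for root, dirs, files in os.walk(target):
                for f in files:
                    fullpath = os.path.join(root, f)
                    is_python = os.path.splitext(fullpath)[1] == '.py'
                    if is_python and os.path.basename(fullpath) not in excluded_list:
                        included_files.append(fullpath)
                # Only after the top-level files are collected do we stop.
                if not recursive:
                    break
        else:
            if target not in excluded_list:
                included_files.append(target)
    return included_files


def demo():
    # Hypothetical layout: top/a.py, top/skip.py, top/notes.txt, top/sub/b.py
    with tempfile.TemporaryDirectory() as top:
        sub = os.path.join(top, 'sub')
        os.mkdir(sub)
        for name in ('a.py', 'skip.py', 'notes.txt', os.path.join('sub', 'b.py')):
            open(os.path.join(top, name), 'w').close()
        shallow = discover_files([top], 'skip.py')
        deep = discover_files([top], 'skip.py', recursive=True)
        return (
            sorted(os.path.relpath(p, top) for p in shallow),
            sorted(os.path.relpath(p, top) for p in deep),
        )
```

If the break came before the inner loop, a non-recursive run would return nothing for a directory target, which is the case this comment guards against.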
| vulnerabilities, | ||
| args.baseline | ||
| ) | ||
| vulnerabilities = get_vulnerabilities_not_in_baseline( |
Thank you for de-denting the if args.baseline: line. You can de-dent the vulnerabilities = get_vulnerabilities_not_in_baseline( call too 👍
|
|
||
| for target in targets: | ||
| if os.path.isdir(target): | ||
| for root, dirs, files in os.walk(target): |
I think this line, for root, dirs, files in os.walk(target): is indented one more level than it has to be.
I will try to do better at returning "included_files".
KevinHock left a comment
LGTM, just make the tests pass and I'll merge 👍 (Feel free to write tests for discover_files if you'd like to though.)
| excluded_list = excluded_files.split(",") | ||
| for target in targets: | ||
| if os.path.isdir(target): | ||
| for root, dirs, files in os.walk(target): |
Nit: You can de-dent from line 38 to line 44
| args.baseline | ||
| ) | ||
|
|
||
|
|
Nit: You can delete this newline
|
And it's done. I will write tests for |
KevinHock left a comment
So happy to merge 🎊 🎈 🎉 🎂
…ll versions, add travis commands to tox so this does not happen again




Issue: #127
There are a few steps left to complete this PR. Now we can get all ".py" files in a directory and exclude some files with the "-x" option.