Identify renamed and modified file in Git using SHA1 - c#

I'm hacking around in a Git repository at a low level, trying to retrieve a file's history from it, and I'm having difficulty identifying a file that was modified and renamed in the same revision.
I'm developing a C# application and need to implement the git log --follow FILENAME feature.
Modification is simple: search for the file with the given path in the trees attached to the revision; if the SHA1 differs, voilà!
Rename is simple too: if the search by path was not successful, look for an object with the same SHA1 as before; if found, voilà!
But if it is not found, it might be either a file deletion, in which case my search is over, or a rename and modify in the same revision. How do I distinguish between these cases?
I've studied everything I could find about Git internals, but I still cannot work out what to do in this case. What might the tree objects corresponding to the same modified and renamed file in different revisions have in common?
Many thanks in advance for your help!

Git already has that functionality. See the -M/--find-renames, -C/--find-copies and -C -C/--find-copies-harder options to diff (they apply to log and show as well) and the --follow option to log.
The principle of --find-renames is that if it sees a new file in a revision, it looks at the files removed in that revision, compares them, and if any is similar enough, declares it a rename.
Edit: In more detail: to detect copies and renames, Git compares the two revisions. First it compares the lists of files. Then, for each path that appears only in the new revision, it compares the content with the content of files from the old revision: with -M, those that were deleted; with -C, those that were modified; with -C -C, all of them. If they are similar enough (which requires a diff), it marks the path as a rename or copy as appropriate. This is part of the diff core and is available to all commands that show diffs in any form, including --name-status, which does not do a detailed line-by-line analysis. On top of this, --follow works by iterating over the revisions one by one: it does a name-status diff with rename detection, outputs the revision if the file was modified, and remembers the new (old) name when it was renamed.
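The procedure described above can be sketched roughly as follows. This is only an illustration, not Git's implementation: revisions are modelled as plain dictionaries mapping path to content, the function names (find_old_path, follow) are hypothetical, and difflib's line-based ratio stands in for Git's own similarity estimate. The 0.5 threshold mirrors the 50% default of -M.

```python
import hashlib
from difflib import SequenceMatcher


def git_blob_sha1(content: bytes) -> str:
    # Git hashes a blob as SHA-1 over the header "blob <size>" + NUL + content.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def similarity(old: bytes, new: bytes) -> float:
    # Assumption: a line-based difflib ratio as a stand-in for Git's similarity index.
    return SequenceMatcher(None, old.splitlines(), new.splitlines()).ratio()


def find_old_path(new_path, old_tree, new_tree, threshold=0.5):
    """Return the path in old_tree that new_path most likely came from, or None."""
    if new_path in old_tree:
        return new_path  # same path; content may or may not have changed
    target = new_tree[new_path]
    # Like -M, only consider paths that disappeared between the two revisions.
    deleted = [p for p in old_tree if p not in new_tree]
    # Exact rename: a deleted path with an identical blob id.
    target_id = git_blob_sha1(target)
    for p in deleted:
        if git_blob_sha1(old_tree[p]) == target_id:
            return p
    # Rename plus modification: the most similar deleted file, if similar enough.
    scored = [(similarity(old_tree[p], target), p) for p in deleted]
    if scored:
        score, path = max(scored)
        if score >= threshold:
            return path
    return None


def follow(path, revisions):
    """Walk revisions (newest first) and list (index, path, status) for one file."""
    history = []
    for i, new_tree in enumerate(revisions):
        if path not in new_tree:
            break
        old_tree = revisions[i + 1] if i + 1 < len(revisions) else {}
        old_path = find_old_path(path, old_tree, new_tree)
        if old_path is None:
            history.append((i, path, "added"))
            break
        if old_path != path:
            history.append((i, path, "renamed from " + old_path))
        elif old_tree[old_path] != new_tree[path]:
            history.append((i, path, "modified"))
        path = old_path  # keep following under the older name
    return history


revisions = [
    {"b.txt": b"line1\nline2\nline3\nline4 changed\n"},  # newest: renamed and modified
    {"a.txt": b"line1\nline2\nline3\nline4\n"},
    {"a.txt": b"line1\nline2\n"},                         # oldest
]
for entry in follow("b.txt", revisions):
    print(entry)
```

So when the path lookup and the SHA1 lookup both fail, the file is not necessarily deleted: it may have been renamed and modified, and only a content comparison against the paths removed in that revision can tell.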

Related

How to 'combine' histories of two files (one being the older 'version' of the other one)

I basically have what's a poor-man's versioning...
At one point someone copied / renamed the 'file.cs' to 'old-file.cs' - and all its history up to that point going with it.
And then created a new 'file.cs' - with all the new history going forward.
I ended up with the same file having history split up in between these two files.
I know this must be simple (if possible),
- I've tried searching, but my problem is how to 'phrase the question'
- This isn't a 'merge' (I think - I don't have branches involved),
- It's not the typical 'move' either
- I've looked up the tf command line but nothing resembles what I need
- I have the TFS Source Control Explorer Extension installed (but it can't really help with this)
FWIW, I'm using the VS 2015, C# project (both files are part of the same project), though I don't mind if the solution is command line 'tf' or whatever gets the job done.
So if anyone could help point me to the right direction at least it would be much appreciated. Thanks!
I have tested with TFS 2015.3 + VS 2015.3, but couldn't reproduce your scenario. In my test, the history in old file has been migrated to new file. You may check my steps to see whether they are the same as yours:
Rename a file gulpfile.js to old-gulpfile.js, and check it in in Source Control Explorer. Then copy old-gulpfile.js in workspace and modify it to gulpfile.js, and add it to source control and check it in.
Check old-gulpfile.js history:
Check gulpfile.js history:
You can see all history in old-gulpfile.js is also in new gulpfile.js file.

Git: File deleted locally but changed remotely. How do I see their changes?

On a reasonably large project I occasionally find myself in the following situation during a merge (I'm using the Windows GUI for Git called GitExtensions, but I'm equally comfortable with the command line):
File does not have a local revision. The file has been deleted locally (ours) but modified remotely (theirs).
I then get options to either: "Delete file (ours)", "Keep modified (theirs)", or "Keep base file".
I understand what has happened here and what these options are, but in most cases I don't know how to proceed unless I can see the changes that were made to the file on the branch that I'm merging in. If it was an unimportant change (whitespace / formatting / 'using' statements) then it doesn't matter and I can just keep the deletion, but if they made some more significant changes then I'm going to have to spend some time manually hunting these down and merging them.
Does anyone know of a git command I can run that will show me a diff of base -> remote in this situation? At the moment when this happens I'm going over to my colleague's desk to ask them what changes they made to that file so I can continue with my merge.
Well, it's not the best solution, but it's possibly the easiest. If you start the merge in Git (but before you act on it) Git will create a .BASE, .LOCAL and .REMOTE file in the same location as the file you need to merge.
In the case I've outlined above either .LOCAL or .REMOTE will be missing (depending on who deleted the file), but you can still go to the folder in question and manually diff either .BASE -> .REMOTE or .BASE -> .LOCAL.
This is probably easiest for Windows users.
If you really want to use the command line you can run
git merge-base <mergingBranch> HEAD
to find out the base commit hash. Then, using part of bundacia's answer above:
git difftool <mergeBase>..MERGE_HEAD -- foo.file
You can get a list of the changes made to the file on the remote with this command:
git log -p MERGE_HEAD -- foo
Explanation:
foo is the file in question.
MERGE_HEAD points to the HEAD of the remote branch you're merging with
-p causes log to print the diffs with each commit
I was successful using Git Gui. It came as part of msysgit and can be accessed from the Windows Explorer context menu.
At the point where git merge tells me to fix the conflicts and then commit, I opened Git Gui (see screenshot below).
In the upper-left corner, it shows all changes that are unstaged (because of conflicts). To the right of it, there's a diff of the file in question. It tells me that the local version has been deleted (first two lines), but also shows the remote changes (3rd-last line). I'm pretty happy with that!

How do I uniquely tag a file and its various versions?

I am working on a small application that lets me modify files and version each file before each change. What I would like the app to do is uniquely mark each file so that whenever the same file is opened, the history for that particular file can be pulled back up. I am not using any of the big version control tools for this. How do I do this programmatically, please?
Simple solution: use a version control system that already exists (e.g. Git). But if you really want to do this yourself, try the following.
Each time you create a new version copy the previous version of the file into a separate hidden directory and have a config file in that directory which holds the checksum of that file. Checksum will "more than likely" be unique since its a hashed value of the file (each time file changes, checksum will be different - you need to calculate the checksum yourself.)
When you open a file just check if there is that config file in the directory and compare the checksum with the checksum of what's already open. If they are the same then you are on the same file. That's how it works.
You could use checksums to optimise it. If a user opens a file, changes things, changes them back to the way they were and saves, the checksum should come out the same (unless you include the modified date and time, etc.).
Each folder should have a name which follows a pattern (filenameVn.n eg. someTextFile.txt.v1.0) then you will be able to figure out what the directory you are navigating to in the history should say.
Another approach would be to simply copy the file and append some tag onto the end of it (checksum maybe? version number?) so then you wouldn't need extra folders.
Yet another approach would be to name each stored file after its checksum and keep the history of versions (along with the corresponding checksums) in a separate config file, then refer to it when you want to figure out what the file you want to access is called. So each version will be referred to by its own checksum (like in Git).
So to sum up each file version would be stored somewhere, you will be able to validate if they are the same (so you can optimise by avoiding storing multiple files with no changes in them and wasting space) and you will be able to dynamically determine where each version is and get access to it.
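A minimal sketch of the last approach, under some assumptions that are not in the answer: each version is stored under its SHA-256 checksum in an objects folder, and a history.json file maps a file name to its ordered list of checksums. The class name VersionStore and the folder layout are made up for illustration.

```python
import hashlib
import json
from pathlib import Path


class VersionStore:
    """Hypothetical layout: <root>/objects/<checksum> plus <root>/history.json."""

    def __init__(self, root):
        self.root = Path(root)
        self.objects = self.root / "objects"
        self.objects.mkdir(parents=True, exist_ok=True)
        self.index = self.root / "history.json"

    def _load(self):
        # The index maps a file name to its list of checksums, oldest first.
        return json.loads(self.index.read_text()) if self.index.exists() else {}

    def save_version(self, name, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        history = self._load()
        versions = history.setdefault(name, [])
        if versions and versions[-1] == digest:
            return digest  # nothing changed since the last version; store nothing
        # Identical content always maps to the same object file, so it is stored once.
        (self.objects / digest).write_bytes(content)
        versions.append(digest)
        self.index.write_text(json.dumps(history, indent=2))
        return digest

    def history(self, name):
        return self._load().get(name, [])

    def read_version(self, digest: str) -> bytes:
        return (self.objects / digest).read_bytes()
```

Saving the same content twice in a row records only one version, which is the space optimisation mentioned above; reverting to older content adds a history entry but reuses the existing object file.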
Hope it gives you a bit more understanding of how to get started.

Differences between folders

I need to add a feature to my .NET app that lets users see which changes they made to configuration files. Every configuration is archived in its own folder, e.g.
ConfigV001
ConfigV002
...
ConfigV100
I think I could use git diff as follows:
git diff ConfigV001 ConfigV002
to get the differences, but then how can I format the output to obtain something like this? (The screenshot is taken from github-for-windows.)
I would like to have the list of changes between the 2 versions (added, updated, removed files) and the changes for each file.
Take a look at Diff.NET; includes source code for the utility and screenshots showing comparable behavior to what you're looking for.
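To illustrate the two pieces of output the question asks for (the list of added, updated and removed files, and the changes within one file), here is a rough sketch in Python using only the standard library. It is not how Diff.NET or git diff works, and the function names are hypothetical. Note that filecmp compares files shallowly by default, so two files with the same size and modification time are assumed equal.

```python
import difflib
import filecmp
from pathlib import Path


def compare_folders(old_dir, new_dir):
    """Return (added, removed, updated) lists of relative paths."""
    added, removed, updated = [], [], []

    def walk(cmp, prefix):
        # right_only / left_only may name whole subfolders, listed as one entry.
        added.extend(prefix + name for name in cmp.right_only)
        removed.extend(prefix + name for name in cmp.left_only)
        updated.extend(prefix + name for name in cmp.diff_files)
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix + name + "/")

    walk(filecmp.dircmp(old_dir, new_dir), "")
    return sorted(added), sorted(removed), sorted(updated)


def file_diff(old_dir, new_dir, rel_path):
    """Return a unified diff of one file that exists in both folders."""
    old = (Path(old_dir) / rel_path).read_text().splitlines(keepends=True)
    new = (Path(new_dir) / rel_path).read_text().splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old, new, "a/" + rel_path, "b/" + rel_path)
    )
```

Calling compare_folders("ConfigV001", "ConfigV002") and then file_diff for each updated path gives the raw data; how it is rendered (colours, side-by-side view) is left to the UI.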

TFS: Find a moved file by the old name

Is there a way to get the version history of a file if you only know an old name of the file?
I am currently looking at an old copy of our repository (I don't know the exact date the copy was taken). When I compare it to the current repository, there is one file that exists only in the copy, not in the current repository. It has not been deleted from the repository, so I guess it has been moved or renamed. Is there any way in TFS to find the version history using the old path and name?
I know that I could dig around using the name or some code-fragments. But IMO this is not an acceptable solution when using a repository :)
Thank you very much
Andreas
In Team Explorer 2010, you can simply turn on the "Show Deleted Files" option and navigate to the original folder, you'll be able to then see the file that was moved or deleted. You can view history on the item to see its last changeset - this will show you whether it was outright deleted, or if it was just renamed and thus the item no longer exists in the current path name (aka "slot") and was deleted that way. You can further drill down in to changeset details for that changeset to see the new path name (slot) that item occupies.
As you mention, you could certainly do this with a little bash against the TFS API using the GetItems method. Though I understand that it's not what you want to do, I thought it worth saying just because the TFS API is surprisingly easy to work with.
A couple of simple approaches (not already suggested in other answers) may help:
In your new repository, go to the folder that used to contain the old file, right-click and show History. This will show all the versioned changes to files in that folder. Now look through the list of changes for files that no longer exist in the folder, and double-click them to view them and determine whether the file looks like an ancestor of your new file.
Or go for a brute-force approach: get all the source code onto your disk and search for files of the same name, or files with some of the same text in them, as the file you're looking for. (I'd look for comments that seem like they might be fairly old and that use distinctive wording unlikely to have appeared in many places. Comments are less likely to have changed than class/method names, which might have been refactored if the file was renamed.)
Grep may be an ugly, brute force way of approaching the problem, but sometimes it's the quickest and easiest. The TFS CLI tools are powerful, but unhelpful, complex and poorly documented, so unless you're already an expert, they can take a lot of trial and error to get them to do what you want.
