DVC Bugfix: securely pull data from azure blob backend

in utopian-io •  6 years ago 

Bug Fix: enable DVC to pull data from Azure blob in a more secure way

DVC is a "Data Version Control" system. It keeps your actual data files tucked away on suitable media, such as cloud-based blob storage, while keeping the code naturally in sync with the data, using Git. This way you get to eat the cake (sync large, binary, data files with your code) while having it too (keeping the repository small and fast-performing).

Since the code and data are related, automatic CI pipelines can rely on such data and even test and validate it. However, for security reasons, we typically want to run the CI pipelines with reduced credentials, rather than give them full control of the cloud account used as the storage backend. It turned out that with the Azure blob backend, such reduced credentials caused DVC to choke, as described in this issue, effectively preventing the use of that security measure.

Fortunately, some digging revealed the root cause for the issue, and the workaround I suggested in this PR was quickly reviewed and merged.

  • What was the issue?

In Azure, the common way to authenticate automated tasks to the blob storage is by using SAS tokens instead of the full credentials. However, using such a token for a dvc pull command in the CI pipeline resulted in an error message: "This request is not authorized to perform this operation." This happened even though the SAS token did have read permission.

  • The root cause:

It turned out that during the initialization of the blob_service object, the existing implementation always attempted to create the container in the blob storage. This is useful, e.g. for the first use of the dvc push command, when the container does not exist yet, and it does nothing if the container already exists. However, creating a container is a write operation, so if you are only trying to pull and you have only read permissions, it fails.

  • What was the solution?

Instead of always trying to create the container, we should first check whether it exists. If it does, there is no reason for the pull command to fail. If it does not, the command will fail with a more comprehensible error message. And if you are indeed trying to push, you actually need write permissions, so the error would be expected.
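The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustration, not the actual DVC code or the code from the PR: `ReadOnlyBlobService`, `init_always_create` and `init_check_first` are hypothetical names, and the stand-in class assumes that a read-only token is allowed to check whether a container exists (which, in real Azure, depends on how the token was scoped).

```python
class ReadOnlyBlobService:
    """Hypothetical stand-in for a blob client that holds a read-only token."""

    def __init__(self, existing_containers):
        self._containers = set(existing_containers)

    def exists(self, container_name):
        # Assumption: the read-only token is allowed to check for existence.
        return container_name in self._containers

    def create_container(self, container_name):
        # Creating a container is a write operation, so it is always rejected.
        raise PermissionError(
            "This request is not authorized to perform this operation."
        )


def init_always_create(service, container_name):
    """Old behaviour: unconditionally try to create the container."""
    service.create_container(container_name)


def init_check_first(service, container_name):
    """Fixed behaviour: create the container only when it is missing."""
    if not service.exists(container_name):
        service.create_container(container_name)


service = ReadOnlyBlobService(existing_containers={"dvc-data"})

# Old behaviour: fails even though the container is already there.
try:
    init_always_create(service, "dvc-data")
    old_result = "ok"
except PermissionError as exc:
    old_result = f"failed: {exc}"

# Fixed behaviour: the container exists, so nothing is written.
init_check_first(service, "dvc-data")
new_result = "ok"

print(old_result)
print(new_result)
```

With a container that does not exist, `init_check_first` still ends up calling `create_container` and fails on the read-only stand-in, which matches the point above: at that stage you are effectively pushing, and write permissions are required anyway.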


Welcome to STEEM blockchain and Utopian ecosystem!

The contribution post is clear enough to define the problem and the solution. I see that the project owner already merged the changeset into the codebase.

Even though we love contributions like this, we have a set of pre-defined questions to determine the scores of contributions. You can see the relevant questions for each category here.

That being said, this contribution got a low score on the total volume of the work. Keep that in mind for your future submissions. :)


Your contribution has been evaluated according to Utopian policies and guidelines, as well as a predefined set of questions pertaining to the category.


[utopian-moderator]

Thanks for the fair review :-)

Thank you for your review, @emrebeyler! Keep up the good work!


