
fix: Add V100 (older) GPU Support for Mistral 7b Models #1279


Closed
wants to merge 7 commits into from

Conversation


@xihajun xihajun commented Nov 23, 2023

What does this PR do?

Introduction

This PR introduces changes that significantly improve support for running Mistral 7b models on V100 GPU architectures. By updating to the latest transformers package, we ensure both compatibility and performance for users with V100 GPUs.

Changes

  • Altered the model instantiation logic for the case where the model type is 'mistral'.
  • In environments where FLASH_ATTENTION is not available, the code now falls back to the generic CausalLM, so Mistral models remain operational even without Flash Attention support (see the sketch below).
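
For reviewers, here is a minimal sketch of the fallback. It assumes the dispatch structure of text_generation_server/models/__init__.py; the import paths, the get_model signature, and the surrounding error handling are approximations for illustration, not the exact diff.

```python
# Sketch only: import paths and signatures approximate the repository layout.
from text_generation_server.models.causal_lm import CausalLM

try:
    # FlashMistral requires the flash-attention kernels, which are not
    # available for V100 (sm_70) GPUs.
    from text_generation_server.models.flash_mistral import FlashMistral

    FLASH_ATTENTION = True
except ImportError:
    FLASH_ATTENTION = False


def get_model(model_id: str, model_type: str, **kwargs):
    if model_type == "mistral":
        if FLASH_ATTENTION:
            # Use the optimized flash-attention implementation on supported GPUs.
            return FlashMistral(model_id, **kwargs)
        # Fall back to the generic transformers-based CausalLM so Mistral
        # models still load and run on V100 GPUs.
        return CausalLM(model_id, **kwargs)
    raise ValueError(f"Unsupported model type: {model_type}")
```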

Impact

These updates are particularly beneficial for users running on V100 GPUs, which previously hit compatibility issues because the flash-attention package does not support that architecture. With these changes, we expand hardware support, reduce barriers to entry, and streamline the experience for a significant segment of the user base.

Additional Notes

The transformers package has been updated to version 4.35.2, which now includes support for Mistral models. This update is essential since the flash-attention package does not support V100 GPUs, as discussed in Dao-AILab/flash-attention#148.
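
As an illustrative check (not code from this PR), the snippet below verifies that the installed transformers release matches the pinned version and exposes the Mistral architecture class:

```python
# Illustrative sanity check: confirm the pinned transformers version and
# that the Mistral model class is importable.
import transformers
from packaging import version  # packaging is already a transformers dependency

required = "4.35.2"  # version pinned by this PR
installed = transformers.__version__
if version.parse(installed) < version.parse(required):
    raise RuntimeError(f"transformers >= {required} required, found {installed}")

from transformers import MistralForCausalLM  # noqa: F401

print(f"transformers {installed} provides MistralForCausalLM")
```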

Testing

  • Comprehensive tests have been conducted to ensure that Mistral models perform as expected on V100 GPUs.
  • Additional benchmarks have been added to compare the performance on V100 GPUs with previous versions.

I invite reviewers to pull this branch, test rigorously with V100 GPUs, and provide feedback on any aspects of this enhancement.

Before submitting

Who can review?

@abhijithnair1

xihajun and others added 5 commits November 23, 2023 16:34
Modified the instantiation logic for the 'mistral' model type to support V100 GPU architectures. If FLASH_ATTENTION is not available, the code falls back to the generic CausalLM, ensuring functionality regardless of the underlying hardware. This change circumvents the incompatibility issue with the flash-attention package on V100 GPUs.
@xihajun xihajun changed the title Enhance V100 (older) GPU Support for Mistral 7b Models fix: Add V100 (older) GPU Support for Mistral 7b Models Nov 27, 2023