Today, generative AI models cover a variety of tasks, from text summarization and Q&A to image and video generation. To improve the quality of their output, approaches like few-shot learning, prompt engineering, Retrieval Augmented Generation (RAG), and fine-tuning are used. Fine-tuning allows you to adapt these generative AI models to achieve improved performance on your domain-specific tasks.
With Amazon SageMaker, you can now run a SageMaker training job simply by annotating your Python code with the @remote decorator. The SageMaker Python SDK automatically translates your existing workspace environment, along with any associated data processing code and datasets, into a SageMaker training job that runs on the training platform. This has the advantage of letting you write code in a more natural, object-oriented way, while still using SageMaker capabilities to run training jobs on a remote cluster with minimal changes.
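To illustrate the pattern, here is a minimal sketch; the function name, hyperparameter, and instance type are illustrative, not taken from the original notebook:

```python
from sagemaker.remote_function import remote

# The decorator turns a local function call into a SageMaker training job;
# dependencies listed in requirements.txt are installed in the job container.
@remote(instance_type="ml.g5.12xlarge", dependencies="./requirements.txt")
def train(learning_rate: float = 2e-4) -> str:
    # Everything in this body runs remotely on the training instance.
    return f"trained with lr={learning_rate}"

# Invoking the function triggers the job and blocks until it completes.
print(train(learning_rate=1e-4))
```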
In this post, we showcase how to fine-tune the Falcon-7B Foundation Model (FM) using the @remote decorator from the SageMaker Python SDK. The solution also uses Hugging Face's parameter-efficient fine-tuning (PEFT) library and quantization techniques through bitsandbytes to support fine-tuning. The code presented in this post can also be used to fine-tune other FMs, such as Llama 2 13B.
The full-precision representation of this model may have trouble fitting into memory on a single or even several Graphics Processing Units (GPUs), or may even need a bigger instance. Hence, in order to fine-tune this model without increasing cost, we use the technique known as Quantized LLMs with Low-Rank Adapters (QLoRA). QLoRA is an efficient fine-tuning approach that reduces the memory usage of LLMs while maintaining good performance.
Advantages of using the @remote decorator
Before going further, let's understand how the @remote decorator improves developer productivity while working with SageMaker:
The @remote decorator triggers a training job directly from native Python code, without the explicit invocation of SageMaker Estimators and SageMaker input channels.
Low barrier to entry for developers training models on SageMaker.
No need to change integrated development environments (IDEs). Continue writing code in your IDE of choice and invoke SageMaker training jobs.
No need to learn about containers. Continue providing dependencies in a requirements.txt and supply it to the @remote decorator.
Prerequisites
An AWS account is required with an AWS Identity and Access Management (IAM) role that has permissions to manage resources created as part of the solution. For details, refer to Creating an AWS account.
In this post, we use Amazon SageMaker Studio with the Data Science 3.0 image and an ml.t3.medium fast launch instance. However, you can use any integrated development environment (IDE) of your choice. You just need to set up your AWS Command Line Interface (AWS CLI) credentials correctly. For more information, refer to Configure the AWS CLI.
For fine-tuning Falcon-7B, an ml.g5.12xlarge instance is used in this post. Please ensure sufficient capacity for this instance type in your AWS account.
You need to clone this GitHub repository to replicate the solution demonstrated in this post.
Solution overview
1. Install prerequisites for fine-tuning the Falcon-7B model
2. Set up remote decorator configurations
3. Preprocess the dataset containing AWS services FAQs
4. Fine-tune Falcon-7B on AWS services FAQs
5. Test the fine-tuned model on sample questions related to AWS services
1. Install prerequisites for fine-tuning the Falcon-7B model
Launch the notebook falcon-7b-qlora-remote-decorator_qa.ipynb in SageMaker Studio by selecting the Image as Data Science and Kernel as Python 3. Install all the required libraries mentioned in requirements.txt. A few of the libraries need to be installed on the notebook instance itself. Perform the other operations needed for dataset processing and triggering the SageMaker training job.
2. Set up remote decorator configurations
Create a configuration file where all the configurations related to the Amazon SageMaker training job are specified. This file is read by the @remote decorator while running the training job. It contains settings like the dependencies, training image, instance type, and execution role to be used for the training job. For a detailed reference of all the settings supported by the config file, check out Configuring and using defaults with the SageMaker Python SDK.
It's not mandatory to use a config.yaml file in order to work with the @remote decorator. It is just a cleaner way to supply all configurations to the @remote decorator. This keeps SageMaker- and AWS-related parameters outside of the code, with a one-time effort to set up the config file that is then shared across team members. All the configurations could also be supplied directly in the decorator arguments, but that reduces the readability and maintainability of changes in the long run. Also, the configuration file can be created by an administrator and shared with all the users in an environment.
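As a rough sketch, such a config file might look like the following; the instance type, role ARN, and other values are placeholders to adapt to your own account:

```yaml
# Illustrative defaults file for the SageMaker Python SDK; values are placeholders.
SchemaVersion: '1.0'
SageMaker:
  PythonSDK:
    Modules:
      RemoteFunction:
        Dependencies: ./requirements.txt
        InstanceType: ml.g5.12xlarge
        RoleArn: arn:aws:iam::<account-id>:role/<execution-role>
        IncludeLocalWorkDir: true
```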
3. Preprocess the dataset containing AWS services FAQs
The next step is to load and preprocess the dataset to make it ready for the training job. First, let us take a look at the dataset:
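A sketch of loading and inspecting the data might look like the following; the file name and the column names ("question", "answers") are assumptions, as the repository ships its own copy of the FAQ data:

```python
from datasets import load_dataset

# Hypothetical file name; substitute the path used in the repository.
dataset = load_dataset("csv", data_files="aws_faqs.csv", split="train")
print(dataset[0])  # e.g. {"question": "What is Amazon EC2?", "answers": "..."}
```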
It shows an FAQ for one of the AWS services. In addition to QLoRA, bitsandbytes is used to quantize the frozen LLM to 4-bit precision and attach LoRA adapters to it.
Create a prompt template to convert each FAQ sample to a prompt format:
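A minimal sketch of such a template, assuming the "question" and "answers" columns from above:

```python
# Each sample becomes a single "text" field combining question and answer.
prompt_template = "{question}\n---\nAnswer:\n{answer}"

def to_prompt(sample):
    sample["text"] = prompt_template.format(
        question=sample["question"], answer=sample["answers"]
    )
    return sample
```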
The next step is to convert the inputs (text) to token IDs. This is done by a Hugging Face Transformers tokenizer.
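For example, with the Falcon-7B tokenizer:

```python
from transformers import AutoTokenizer

model_id = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Falcon's tokenizer defines no pad token by default; reuse EOS for padding.
tokenizer.pad_token = tokenizer.eos_token
```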
Now simply use the prompt_template function to convert all the FAQs to the prompt format, and set up the train and test datasets.
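Continuing the sketch above (the split ratio is illustrative):

```python
# Apply the template to every sample, then hold out a small test split.
dataset = dataset.map(to_prompt)
splits = dataset.train_test_split(test_size=0.1, seed=42)
train_dataset, test_dataset = splits["train"], splits["test"]
```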
4. Fine-tune Falcon-7B on AWS services FAQs
Now you can prepare the training script: define the training function train_fn and put the @remote decorator on the function.
The training function does the following:
Tokenizes and chunks the dataset
Sets up BitsAndBytesConfig, which specifies that the model should be loaded in 4-bit but computation should be carried out in bfloat16
Loads the model
Finds the target modules and updates the necessary matrices by using the utility method find_all_linear_names
Creates LoRA configurations that specify the rank of the update matrices (r), the scaling factor (lora_alpha), the modules to apply the LoRA update matrices to (target_modules), the dropout probability for LoRA layers (lora_dropout), the task_type, and so on
Starts the training and evaluation
Finally, invoke train_fn():
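The following is a condensed sketch of what such a function might look like, based on the steps above; the hyperparameter values, the chunking logic, and the helper details are illustrative rather than the repository's exact code:

```python
import torch
import bitsandbytes as bnb
from sagemaker.remote_function import remote
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

def find_all_linear_names(model):
    # Collect the names of all 4-bit linear layers to target with LoRA.
    names = set()
    for name, module in model.named_modules():
        if isinstance(module, bnb.nn.Linear4bit):
            names.add(name.split(".")[-1])
    names.discard("lm_head")  # never adapt the output head
    return list(names)

@remote(volume_size=50)  # instance type, role, and dependencies come from config.yaml
def train_fn(model_id, train_dataset, test_dataset,
             lora_r=8, lora_alpha=32, lora_dropout=0.05, epochs=1):
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    tokenizer.pad_token = tokenizer.eos_token

    # Tokenize the prompt text; packing/chunking into fixed blocks is elided here.
    def tokenize(sample):
        return tokenizer(sample["text"], truncation=True, max_length=512)

    train_tok = train_dataset.map(tokenize, remove_columns=train_dataset.column_names)
    test_tok = test_dataset.map(tokenize, remove_columns=test_dataset.column_names)

    # Load the frozen base model in 4-bit, computing in bfloat16 (QLoRA).
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=bnb_config, device_map="auto",
        trust_remote_code=True,
    )
    model = prepare_model_for_kbit_training(model)

    # Attach LoRA adapters to every linear layer found above.
    peft_config = LoraConfig(
        r=lora_r, lora_alpha=lora_alpha, lora_dropout=lora_dropout,
        target_modules=find_all_linear_names(model),
        bias="none", task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, peft_config)

    trainer = Trainer(
        model=model,
        train_dataset=train_tok,
        eval_dataset=test_tok,
        args=TrainingArguments(
            output_dir="/opt/ml/model", num_train_epochs=epochs,
            per_device_train_batch_size=8, bf16=True, logging_steps=10,
        ),
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    trainer.evaluate()
    trainer.save_model("/opt/ml/model")

# Invoking the decorated function starts the SageMaker training job.
train_fn("tiiuae/falcon-7b", train_dataset, test_dataset)
```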
The tuning job then runs on the Amazon SageMaker training cluster. Wait for the tuning job to finish.
5. Test the fine-tuned model on sample questions related to AWS services
Now, it's time to run some tests on the model. First, let us load the model:
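One way to do this, assuming the LoRA adapters produced by the training job have been downloaded to a local directory (the adapter path here is a placeholder):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load the frozen base model, then attach the trained LoRA adapters.
base_model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)
model = PeftModel.from_pretrained(base_model, "./model")  # placeholder adapter path
```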
Now load a sample question from the training dataset to see the original answer, and then ask the tuned model the same question to compare the answers.
Here is a sample question from the training set, together with the original answer:
Now the same question is asked of the tuned Falcon-7B model:
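A sketch of generating the tuned model's answer, reusing the prompt format from training:

```python
question = train_dataset[0]["question"]
prompt = f"{question}\n---\nAnswer:\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=100, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```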
This concludes the implementation of fine-tuning Falcon-7B on an AWS services FAQ dataset using the @remote decorator from the Amazon SageMaker Python SDK.
Cleaning up
Complete the following steps to clean up your resources:
Shut down the Amazon SageMaker Studio instances to avoid incurring additional costs.
Clean up your Amazon Elastic File System (Amazon EFS) directory by clearing the Hugging Face cache directory:
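For example, assuming the cache is in the default location under your home directory:

```python
import shutil
from pathlib import Path

# Remove the Hugging Face cache from the Studio EFS volume;
# adjust the path if your cache lives elsewhere.
shutil.rmtree(Path.home() / ".cache" / "huggingface", ignore_errors=True)
```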
Conclusion
In this post, we showed you how to use the @remote decorator's capabilities to fine-tune the Falcon-7B model using QLoRA and Hugging Face PEFT with bitsandbytes, without making significant changes to the training notebook, while using Amazon SageMaker capabilities to run the training jobs on a remote cluster.
All the code shown in this post to fine-tune Falcon-7B is available in the GitHub repository. The repository also contains a notebook showing how to fine-tune Llama 2 13B.
As a next step, we encourage you to check out the @remote decorator functionality and the Python SDK API, and use it in your choice of environment and IDE. Additional examples are available in the amazon-sagemaker-examples repository to get you started quickly. You can also check out the following posts:
About the Authors
Bruno Pistone is an AI/ML Specialist Solutions Architect for AWS based in Milan. He works with large customers, helping them to deeply understand their technical needs and design AI and machine learning solutions that make the best use of the AWS Cloud and the Amazon Machine Learning stack. His expertise includes machine learning end to end, machine learning industrialization, and generative AI. He enjoys spending time with his friends, exploring new places, and traveling to new destinations.
Vikesh Pandey is a Machine Learning Specialist Solutions Architect at AWS, helping customers from the financial industries design and build solutions on generative AI and ML. Outside of work, Vikesh enjoys trying out different cuisines and playing outdoor sports.