Artificial intelligence training policy must be clear

April 10, 2026 | 10:00
In June 2025, a California federal court held that Anthropic’s use of millions of pirated copies of books to train its large language models was “transformative, spectacularly so”, a ruling that briefly steadied the AI industry. The respite was short-lived: the same judge then held that a transformative end use cannot cleanse an unlawful acquisition at the front end of the pipeline.
(L-R) Tran Manh Hung, managing partner, BMVN and Alex Do, senior tech executive, BMVN

Two months later, Anthropic settled for $1.5 billion, an object lesson that the legality of AI training turns on provenance, and that each stage of the pipeline carries its own exposure.

Vietnam is watching from a different starting point. It has no fair use doctrine and is building its AI training framework from scratch, across three separate legal regimes.

Model training is not a single act but a pipeline, with each stage carrying its own regulatory implications. Lawmaking has followed this functional split. The Ministry of Science and Technology holds the pen on the AI Law; the Ministry of Public Security oversees personal data protection; and the Ministry of Culture, Sports, and Tourism regulates copyright. As a result, each regulator oversees a different segment of the training pipeline, and no single authority sees the whole.

Vietnam’s AI law, in force since March, explicitly defines a developer as someone who designs, builds, trains, tests, or fine-tunes all or part of an AI model, algorithm, or system. The law acknowledges training as a distinct activity. What it does not do is regulate that activity in any detail.

Substantive compliance obligations are weighted towards deployment. High-risk system providers must maintain risk management measures, govern training data to ensure its quality, and retain technical documentation, all framed around the moment a system enters the market. Risk classification, transparency requirements, and accountability obligations attach primarily to systems being deployed and used.

For a developer who needs to know whether its training pipeline is compliant today, the AI law sets the frame but leaves the substance to other regimes.

The Law on Personal Data Protection (PDPL) and its implementing Decree No.356/2025/ND-CP are explicit but incomplete on AI training. Decree 356 permits organisations to use personal data to research and develop AI systems, but requires compliance with personal data protection regulations throughout. That trailing condition carries the legal weight.

Several tensions remain unresolved. The PDPL’s purpose limitation principle raises a secondary-use problem: data collected for one purpose, such as patient intake records or loan application data, then used to train a separate AI model, may require fresh consent that was never sought.

The framework treats de-identified data as falling outside the PDPL’s scope, which would simplify training pipelines considerably, but neither the PDPL nor Decree 356 defines what technical standard of de-identification is sufficient where re-identification risks from model behaviour are material and well-documented.

Decree 356 adds another complication. Where inference outputs can be used to identify a specific person, those outputs are themselves treated as personal data. A model trained on lawfully de-identified data can, if it memorises or reconstructs personal patterns, generate outputs that fall back within the PDPL’s perimeter. Who bears accountability for that outcome, and when, is not specified.

The upcoming decree amending Decree No.17/2023/ND-CP proposes a text-and-data mining exception implementing a recent amendment to the Intellectual Property Law. Because it remains a draft, its provisions may yet change, but the current text raises notable questions.

The amended law permits use of lawfully accessible works for AI training if such use does not unreasonably prejudice rights holders, without drawing a commercial or non-commercial distinction. Yet the draft decree limits the exception to non-commercial purposes. If that reading survives, the decree would be more restrictive than the legislation it implements.

The draft also requires that AI system outputs not displace the market for the underlying training data, conflating two analytically separate questions: whether training data was lawfully used as input, and whether model outputs infringe copyright as output. Tying the legality of a training act to the character of future model outputs creates a condition that is practically unverifiable at the time training occurs.

Whether these provisions survive the drafting process remains to be seen, but their presence signals uncertainty that developers cannot yet plan around.

None of this operates in isolation, and the intersections are where the hardest compliance questions arise. A dataset satisfying the copyright exception may still contain personal data requiring a separate lawful basis under the PDPL. De-identifying data for PDPL purposes says nothing about the copyright status of the underlying content.

Meanwhile, the draft Decree 17 amendment requires trainers to maintain technical records in accordance with the AI Law, while that law’s regulations on training-stage documentation remain unissued. A Vietnamese AI developer assembling training data today faces exposure across all three regimes simultaneously, with no single authority positioned to confirm its pipeline is clean.

The US experience offers a reference point rather than a standardised template. It would be a mistake to assume that approaches developed in the West should automatically be treated as universal models.

Vietnam is doing what jurisdictions building AI law in real time are increasingly forced to do: legislating ahead of full technical and commercial understanding, at speed, under competitive pressure, and with the knowledge that any framework adopted will be tested by factual scenarios no one can yet anticipate. That is the condition under which this law is being made.

Investors and developers operating in Vietnam need to understand this context clearly. Investment in Vietnam, particularly where digital transformation, AI, and intellectual property assets form the backbone of business models, is being actively encouraged by the government.

There is a discernible shift towards Vietnam as a destination for high-tech investment. Vietnam's legislative and policy efforts reflect an ambition to position itself as a regional hub for innovation anchored in sustainability, legal certainty, and long-term growth.

By Manh Hung and Alex Do
