Personalization · June 28, 2026

Personyze’s New A/B Testing UI: Easier Setup, Auto-Pick Winners, and Clearer Results

Personyze Team · Personalization experts

We rebuilt the Personyze A/B testing experience end to end — from the screen where you choose who’s in a test, to the screen where you set the variants up, to the screen where you read the results. The goal was to make rigorous testing feel obvious: target the right audience visually, split your traffic with a slider, let Personyze pick a winner only when the statistics actually support it, and read every variation against a real control. This post walks through what changed, and you can click around three live demos of the new UI — targeting, setup, and results — right here on the page.

The redesigned results view: every variation measured against a control, with lift and statistical confidence front and center.

What’s new at a glance

A redesigned visual targeting builder with AND / OR / XOR logic, nested groups, an AI assistant, and a live audience forecast.
A visual traffic-split slider with a built-in holdout (control) group.
Clear assignment methods — decide whether a visitor sees the same variation every time or a fresh one each visit.
Multivariate testing with a live sample-size estimate, so you know what each combination costs in traffic.
Auto-pick winner that only fires when significance, sample size, and run-time all check out.
A redesigned results dashboard with per-KPI tabs, segment breakdowns, delivery health, and a plain-English recommendation.

Targeting: choose who sees the test

Before you decide what to test, you decide who is in it — and that screen got the biggest redesign. Targeting is now a visual rule tree you build from simple conditions, with the logic laid out the way you’d actually reason about it.

Rules from plain conditions. Each rule is one check — “URL contains /checkout”, “Country is Netherlands”, a CRM field, device, even local weather.
Include or exclude, per condition. Mix them inside one group — “include /blog but exclude /blog/admin” — without splitting it into two rules.
AND / OR / XOR. Between every two groups sits an operator pill; click it to change how their results combine.
Nested groups with parentheses. Wrap rules so their logic doesn’t bleed across the tree — to any depth, with each level color-coded (green → violet → amber) so brackets never blur together.
Drag to reorder. Grab a rule by its handle and drop it onto any connector — in or out of a nested group — and the connectors heal automatically.
An AI assistant. Type the audience in plain English — “Dutch mobile visitors who haven’t subscribed” — and it builds the rule tree for you, to keep or append to.
A live audience forecast. The sidebar updates as you edit: match percentage, raw counts, and a pie of the last 90 minutes of real traffic — so you size the audience before launching, and can test the rules against live sessions.
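
To make the rule-tree logic concrete, here is a minimal Python sketch of how include/exclude conditions, AND / OR / XOR operators, and nested groups could combine. It is an illustration only, not Personyze's implementation: the Rule and Group classes, the visitor fields, and the left-to-right evaluation order are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Union

# Hypothetical visitor profile; the field names are assumptions for this sketch.
Visitor = dict


@dataclass
class Rule:
    """One check, set to include (must match) or exclude (must not match)."""
    check: Callable[[Visitor], bool]
    include: bool = True

    def matches(self, visitor: Visitor) -> bool:
        hit = self.check(visitor)
        return hit if self.include else not hit


@dataclass
class Group:
    """Rules or nested groups joined by one operator between each pair."""
    items: List[Union[Rule, "Group"]]
    operators: List[str]  # "AND", "OR", or "XOR"; one fewer than items

    def matches(self, visitor: Visitor) -> bool:
        if len(self.operators) != len(self.items) - 1:
            raise ValueError("need exactly one operator between each pair of items")
        # Assumption: operators apply left to right, and nesting a Group
        # (the "parentheses") is the only way to change the order.
        result = self.items[0].matches(visitor)
        for op, item in zip(self.operators, self.items[1:]):
            value = item.matches(visitor)
            if op == "AND":
                result = result and value
            elif op == "OR":
                result = result or value
            elif op == "XOR":
                result = result != value
            else:
                raise ValueError(f"unknown operator: {op}")
        return result


# "Include /blog but exclude /blog/admin", kept inside one nested group.
blog_readers = Group(
    items=[
        Rule(lambda v: "/blog" in v["url"]),
        Rule(lambda v: "/blog/admin" in v["url"], include=False),
    ],
    operators=["AND"],
)

# The nested group OR a plain country condition.
audience = Group(
    items=[blog_readers, Rule(lambda v: v["country"] == "Netherlands")],
    operators=["OR"],
)
```

Calling audience.matches(visitor) with a dictionary such as {"url": "/blog/post", "country": "Germany"} walks the tree and returns a single yes or no. Wrapping the two blog conditions in their own Group is what keeps the exclude from leaking into the country check.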

It’s the same targeting that powers every personalized experience in Personyze, not just tests, and it is closely related to behavioral targeting. Build an audience here:

Live demo · Visitor targeting — add rules, switch AND/OR, try the AI assistant


The test builder

With the audience set, the builder separates two ideas that are easy to confuse. Campaign content is shown to everyone who matches your targeting, on top of whichever group they’re in — it’s layered, not part of the test. Test groups are the variants that actually split the audience; each one is measured independently against the Site original baseline.

From there you control the test with a few deliberate choices:

Traffic split. Drag handles on the bar or type an exact percentage into any group. Give Site original 10–20% as a control to answer “did this beat doing nothing?”, or set it to 0% to put your whole audience in the test.
Assignment method (“Assign by”). Rotate users keeps each visitor in the same group across visits — the right default for funnel and checkout tests. Visits reshuffles on every visit, which suits short-lived banner or headline tests. There are also cross-campaign control cohorts when you want the same permanent holdout across every test you run.
Multivariate combinations. Combinations multiply fast, and each one needs traffic to reach significance — the builder shows a sample-size estimate so you can keep the test winnable.
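
The difference between the two assignment methods is easier to see in code. The sketch below shows one common way to turn a traffic split into a group assignment by hashing an identifier; it is an assumption for illustration, not a description of how Personyze assigns visitors. The function names, the salt, and the visit_number parameter are hypothetical.

```python
import hashlib
from typing import Dict, Optional


def bucket(key: str, salt: str) -> float:
    """Map a key to a stable number in [0, 1) using SHA-256."""
    digest = hashlib.sha256(f"{salt}:{key}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 0x100000000


def assign(
    split: Dict[str, float],
    visitor_id: str,
    test_id: str,
    visit_number: Optional[int] = None,
) -> str:
    """Pick a test group from a percentage split that sums to 100.

    With visit_number=None the result depends only on the visitor, so the
    same visitor always lands in the same group. Passing a visit number
    makes the result change from visit to visit.
    """
    if any(share < 0 for share in split.values()):
        raise ValueError("shares must not be negative")
    if abs(sum(split.values()) - 100) > 1e-9:
        raise ValueError("shares must sum to 100")

    key = visitor_id if visit_number is None else f"{visitor_id}:{visit_number}"
    point = bucket(key, test_id) * 100  # position on the 0-100 traffic bar

    upper = 0.0
    fallback = None
    for group, share in split.items():
        if share <= 0:
            continue  # a 0% group never receives traffic
        upper += share
        fallback = group
        if point < upper:
            return group
    return fallback  # only reached if float rounding nudges point past the last edge


# Hypothetical split: a 20% "Site original" control and two variations.
split = {"Site original": 20, "Variation A": 40, "Variation B": 40}
sticky_group = assign(split, "visitor-123", "checkout-test")
per_visit_group = assign(split, "visitor-123", "checkout-test", visit_number=2)
```

Because the hash depends only on the inputs, the per-visitor form needs no stored state to stay consistent, which is the property funnel and checkout tests rely on. Setting Site original to 0 in the split sends the whole audience to the variations.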

Here’s the builder itself — drag the split, switch the assignment method, and open How testing works in the side panel:

Live demo · A/B test builder — drag the traffic split, try “Assign by”


Auto-pick winners — without the foot-guns

The fastest way to get a wrong answer in testing is to stop the moment a variation looks ahead. Auto-pick winner is built to prevent exactly that: Personyze will only declare and deploy a winner when three conditions are all true.

Auto-pick winner only triggers when significance, sample size, and minimum duration are all satisfied.

Statistical significance. 95% is the industry standard; 90% reaches a call faster but accepts more risk of a false positive.
Minimum sample size. Enough visitors must enter the test before any winner counts — raise it on low-traffic sites to avoid noisy early reads.
Minimum duration. A floor on run-time, ideally at least a week, so weekday and weekend behavior are both represented.
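
The three conditions act as a single gate: all of them must pass before a winner is called. A minimal sketch of that check follows. The class, the function, and the default sample-size value are hypothetical and chosen only to illustrate the logic.

```python
from dataclasses import dataclass


@dataclass
class AutoPickSettings:
    """Thresholds for calling a winner. Defaults here are illustrative only."""
    significance: float = 0.95   # required statistical confidence
    min_sample_size: int = 1000  # hypothetical value; raise it on low-traffic sites
    min_days: int = 7            # at least one full weekday/weekend cycle


def ready_to_pick(
    confidence: float,
    visitors: int,
    days_running: int,
    settings: AutoPickSettings,
) -> bool:
    """Return True only when all three thresholds are met at once."""
    return (
        confidence >= settings.significance
        and visitors >= settings.min_sample_size
        and days_running >= settings.min_days
    )


settings = AutoPickSettings()
# High confidence on day 3 is still not a winner: the duration floor blocks it.
early_call = ready_to_pick(confidence=0.97, visitors=5000, days_running=3, settings=settings)
```

The point of combining the checks with "and" is that an early spike in confidence cannot end the test by itself; the sample-size and duration floors still have to be reached.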

When the call is made — automatically by Auto-pick, or by you from the dashboard’s recommendation — deploying is one action: Personyze applies the recommended winning settings to the campaign and saves them, so the winning variation becomes the live experience for everyone who matches your targeting.

For time-sensitive tests, Auto-adjust rotation shifts more traffic toward the leader as data accrues — faster to a result, at the cost of some statistical purity. It’s a deliberate trade-off, clearly labeled.

The new results dashboard

The results screen is organized around your goals. A tab for the primary goal (say, trial signups) decides the winner; secondary goals ride along for context so you can catch side-effects — a variation that lifts signups but quietly tanks demo requests; and supporting KPIs like bounce rate, click-through, and time on site explain why a variation is winning or losing.

Real lift, not vanity lift. Every variation is compared to a control holdout, so the number you read is lift versus doing nothing — calculated as (variation − control) ÷ control. In the demo, a 4.99% rate against a 3.94% control is a +26.6% lift.
Conclusive at 95%. Confidence comes from a two-sided z-test on conversion rates. Below the threshold, the leader is just leading — not winning.
Estimated time to significance. A pill on each KPI projects forward at your current traffic, so you know whether a result is days or weeks away.
Delivery health. Per-variation checks so a broken or under-served variation doesn’t quietly skew the test.
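
The lift formula and the confidence figure can be reproduced in a few lines. The sketch below uses the standard pooled two-proportion z-test, which is one common form of a two-sided z-test on conversion rates; whether Personyze pools the variance the same way is an assumption. The visitor and conversion counts are hypothetical, picked so the rates match the 4.99% and 3.94% from the demo.

```python
import math


def lift(variation_rate: float, control_rate: float) -> float:
    """Relative lift: (variation - control) / control."""
    return (variation_rate - control_rate) / control_rate


def confidence(conv_v: int, n_v: int, conv_c: int, n_c: int) -> float:
    """Confidence (1 - p-value) from a two-sided, pooled two-proportion z-test.

    conv_* are conversion counts and n_* are visitor counts for the
    variation (v) and the control (c).
    """
    rate_v = conv_v / n_v
    rate_c = conv_c / n_c
    pooled = (conv_v + conv_c) / (n_v + n_c)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_v + 1 / n_c))
    if std_err == 0:
        return 0.0  # no conversions (or all conversions): nothing to compare
    z = (rate_v - rate_c) / std_err
    # Two-sided p-value for a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return 1 - p_value


# Rates from the demo: 4.99% against a 3.94% control, about a 26.6% lift.
demo_lift = lift(0.0499, 0.0394)

# Hypothetical counts: the same rates with 10,000 visitors per group...
large_sample = confidence(conv_v=499, n_v=10_000, conv_c=394, n_c=10_000)
# ...and similar rates with only 1,000 visitors per group.
small_sample = confidence(conv_v=50, n_v=1_000, conv_c=39, n_c=1_000)
```

With these made-up counts the larger sample clears the 95% threshold while the smaller one does not, even though the observed rates are nearly the same. That is the difference between a variation that is leading and one that is winning.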

The breakdown panel is where testing meets personalization. Split the primary goal by device, country, city, URL, or browser, and each slice becomes its own mini-test. When a variation wins overall but a different one wins on mobile, that’s your cue to personalize by segment rather than ship a single global winner.

Once winning experiences have been identified, the next step is refining how they’re presented to users across different devices and audiences. Using modern UI design tools can help teams rapidly prototype interface improvements, validate design ideas, and iterate on layouts before rolling successful test variations into production. This creates a smoother workflow between experimentation and implementation.

Click through the tabs, open a breakdown, and read the built-in docs:

Live demo · A/B test results — switch KPI tabs, open the segment breakdown


Honest testing, baked in

A few principles the new UI nudges you toward, whether you automate the call or make it yourself:

Don’t peek and call. Stopping the instant confidence first touches 95% inflates false positives — let the minimums protect you.
Test one thing at a time. If variations differ in layout, color, and copy at once, a winner won’t tell you why it won.
Use a control when it matters. If you need to know you beat doing nothing, keep a holdout.
Cover a full week. Always run long enough to capture weekday and weekend cycles.

Where it fits

A/B testing isn’t a silo in Personyze — it’s part of the complete personalization platform, sharing the same visitor profile and the same targeting as your recommendations, popups and banners, and behavioral targeting. Test a change, find the winner, and when the segments disagree, turn that finding into a personalized experience instead of a one-size-fits-all rollout. You can start from ready-made templates, and the full setup is documented in the A/B testing wizard guide.

New A/B testing UI: FAQ

Who can use the new A/B testing UI right now?

The redesigned A/B testing experience is currently available to selected customers who opted in to the new interface. A complete rollout to all Personyze accounts is coming soon.

What’s new in Personyze’s A/B testing UI?

Three screens were redesigned: targeting, test setup, and results. Targeting is now a visual rule builder with AND/OR/XOR logic, nested groups, an AI assistant, and a live audience forecast. Setup adds a traffic-split slider with a control holdout, clearer assignment methods, multivariate combinations, and auto-pick winner. The results dashboard adds per-KPI tabs, lift against a control, 95% significance, segment breakdowns, and delivery health.

How does the new targeting builder work?

You build a tree of simple rules — conditions like ‘URL contains /checkout’ or ‘Country is Netherlands’ — set each condition to include or exclude, and combine groups with AND, OR, or XOR. Groups can be nested to any depth with color-coded brackets, rules can be dragged to reorder, an AI assistant turns plain-English audiences into rules, and a live forecast shows what share of recent traffic matches.

How does automatic winner selection work?

Personyze continuously evaluates each variation against your primary goal and calculates statistical confidence with a two-sided z-test. The leading variation is marked as the winner once three thresholds are all met: your significance level (95% is standard), a minimum sample size, and a minimum run duration. Until all three are met, the leader is only leading, not winning.

How does automated deployment work?

Deploying a winner is one action. Personyze applies the recommended winning configuration to your campaign and saves it, so that variation becomes the live experience for everyone who matches your targeting. With Auto-pick enabled this happens automatically as soon as the result is conclusive; with it off, you click Deploy on the dashboard’s recommendation when you’re ready.

Can I make the final call myself instead of auto-deploying?

Yes. Auto-pick is optional. Leave it off and the dashboard still surfaces a recommendation with the supporting metrics, so you can review it and deploy the winner manually whenever you’re satisfied with the result.

When is an A/B test result conclusive?

A KPI becomes conclusive once statistical confidence reaches 95%, based on a two-sided z-test on conversion rates. Below that, the leading variation is just leading, not winning. As a rule of thumb, run at least 14 days and let significance hold for several days before acting.

Can I see A/B test results by segment, like device or country?

Yes. The breakdown panel splits your primary goal by device, country, city, URL, or browser. Each segment is treated as its own mini-test that needs 95% confidence to be conclusive — useful for spotting when you should personalize by segment instead of shipping one global winner.

Do I need code or a developer to run a test?

No. Targeting, traffic split, variants, goals, and deployment are all point-and-click in the Personyze UI. No code is required to build, run, or ship a test.

Want to run tests like this on your own site? Start free or book a demo and we’ll walk you through it.
